Herbicide target genes and methods

ABSTRACT

The invention relates to genes isolated from Arabidopsis that code for proteins essential for seedling growth. The invention also includes the methods of using these proteins to discover new herbicides, based on the essentiality of these genes for normal growth and development. The invention can also be used in a screening assay to identify inhibitors that are potential herbicides. The invention is also applied to the development of herbicide tolerant plants, plant tissues, plant seeds, and plant cells.

[0001] The invention relates to genes isolated from Arabidopsis that code for proteins essential for seedling growth. The invention also includes the methods of using these proteins as herbicide targets, based on the essentiality of the genes for normal growth and development. The invention is also useful as a screening assay to identify inhibitors that are potential herbicides. The invention may also be applied to the development of herbicide tolerant plants, plant tissues, plant seeds, and plant cells.

[0002] The use of herbicides to control undesirable vegetation such as weeds in crop fields has become almost a universal practice. The herbicide market exceeds 15 billion dollars annually. Despite this extensive use, weed control remains a significant and costly problem for farmers. Effective use of herbicides requires sound management. For instance, the time and method of application and stage of weed plant development are critical to getting good weed control with herbicides. Since various weed species are resistant to herbicides, the production of effective new herbicides becomes increasingly important. Novel herbicides can now be discovered using high-throughput screens that implement recombinant DNA technology. Metabolic enzymes found to be essential to plant growth and development can be recombinantly produced through standard molecular biological techniques and utilized as herbicide targets in screens for novel inhibitors of the enzyme activity. The novel inhibitors discovered through such screens may then be used as herbicides to control undesirable vegetation.

[0003] Herbicides that exhibit greater potency, broader weed spectrum, and more rapid degradation in soil can also, unfortunately, have greater crop phytotoxicity. One solution applied to this problem has been to develop crops that are resistant or tolerant to herbicides. Crop hybrids or varieties tolerant to the herbicides allow for the use of the herbicides to kill weeds without attendant risk of damage to the crop. Development of tolerance can allow application of a herbicide to a crop where its use was previously precluded or limited (e.g. to pre-emergence use) due to sensitivity of the crop to the herbicide. For example, U.S. Pat. No. 4,761,373 to Anderson et al. is directed to plants resistant to various imidazolinone or sulfonamide herbicides. An altered acetohydroxyacid synthase (AHAS) enzyme confers the resistance. U.S. Pat. No. 4,975,374 to Goodman et al. relates to plant cells and plants containing a gene encoding a mutant glutamine synthetase (GS) resistant to inhibition by herbicides that were known to inhibit GS, e.g. phosphinothricin and methionine sulfoximine. U.S. Pat. No. 5,013,659 to Bedbrook et al. is directed to plants expressing a mutant acetolactate synthase that renders the plants resistant to inhibition by sulfonylurea herbicides. U.S. Pat. No. 5,162,602 to Somers et al. discloses plants tolerant to inhibition by cyclohexanedione and aryloxyphenoxypropanoic acid herbicides. The tolerance is conferred by an altered acetyl coenzyme A carboxylase (ACCase).

[0004] Notwithstanding the above described advancements, there remain persistent and ongoing problems with unwanted or detrimental vegetation growth (e.g. weeds). Furthermore, as the population continues to grow, there will be increasing food shortages. Therefore, there exists a long-felt yet unfulfilled need to find new, effective, and economic herbicides.

[0005] It is an object of the invention to provide effective and beneficial methods to identify novel herbicides. A feature of the invention is the identification of four genes in Arabidopsis, herein referred to as: the GT1802 gene, which encodes a protein with sequence similarity to a subunit of the cytochrome B6-F complex (Madueno et al. (1992) Plant Mol. Biol. 20: 289-299; Steppuhn et al. (1987) Mol. Gen. Genetics 210: 171-177; Salter et al. (1992) Plant Mol. Biol. 20: 569-574); the GT1209 gene, which encodes a protein with no known function but which may play a role as a subunit in an anaphase-promoting complex (Yu et al. (1998) Science 279: 1219-1222); the GT1354 gene, which encodes a protein with no known function; and the GT0946 gene, which encodes a 4-Diphosphocytidyl-2C-methyl-D-erythritol synthase (Genbank accession number AF230737; Salerno (1986) Plant Sci. 44: 111-117; Coates et al. (1980) J. Biol. Chem. 256: 9225-9229; Follens et al. (1999) J. Bacteriol. 181: 2001-2007; Rohdich et al. (1999) Proc. Natl. Acad. Sci. 96: 11758-11763; Luttgen et al. (2000) Proc. Natl. Acad. Sci. 97: 1062-1067; Herz et al. (2000) Proc. Natl. Acad. Sci. 94: 2487-2490). An important and unexpected feature of the invention is the discovery that each of these genes is essential for seedling growth and development. An advantage of the present invention is that the newly discovered essential genes, which provide novel herbicidal modes of action, enable one skilled in the art to easily and rapidly identify novel herbicides.

[0006] One object of the present invention is to provide essential genes in plants for assay development for inhibitory compounds with herbicidal activity. Genetic results show that when any one of the GT1802, GT1209, GT1354, or GT0946 genes is mutated in Arabidopsis, the resulting phenotype is seedling lethal in the homozygous state. This suggests a critical role for the gene product encoded by each of these genes.

[0007] Using Ac/Ds transposon mutagenesis, the inventors of the present invention have demonstrated that the activity encoded by the Arabidopsis GT1802, GT1209, GT1354, or GT0946 genes (herein referred to as GT1802, GT1209, GT1354, or GT0946 activity) is essential in Arabidopsis seedlings. This implies that chemicals that inhibit the function of any one of these proteins in plants are likely to have detrimental effects on plants and are potentially good herbicide candidates. The present invention therefore provides methods of using a purified protein encoded by any one of the gene sequences described below to identify inhibitors thereof, which can then be used as herbicides to suppress the growth of undesirable vegetation, e.g. in fields where crops are grown, particularly agronomically important crops such as maize and other cereal crops such as wheat, oats, rye, sorghum, rice, barley, millet, turf and forage grasses, and the like, as well as cotton, sugar cane, sugar beet, oilseed rape, and soybeans.

[0008] The present invention discloses a nucleotide sequence derived from Arabidopsis, designated the GT1802 gene. The nucleotide sequence of the cDNA clone is set forth in SEQ ID NO:1, and the corresponding amino acid sequence is set forth in SEQ ID NO:2. The nucleotide sequence of the genomic DNA sequence is set forth in SEQ ID NO:9. Also, the present invention discloses a nucleotide sequence derived from Arabidopsis, designated the GT1209 gene. The nucleotide sequence of the cDNA clone is set forth in SEQ ID NO:3, and the corresponding amino acid sequence is set forth in SEQ ID NO:4. The nucleotide sequence of the genomic DNA sequence is set forth in SEQ ID NO:10. Furthermore, the present invention discloses a nucleotide sequence derived from Arabidopsis, designated the GT1354 gene. The nucleotide sequence of the cDNA clone is set forth in SEQ ID NO:5, and the corresponding amino acid sequence is set forth in SEQ ID NO:6. The nucleotide sequence of the genomic DNA sequence is set forth in SEQ ID NO:11. Furthermore, the present invention discloses a nucleotide sequence derived from Arabidopsis, designated the GT0946 gene. The nucleotide sequence of the cDNA clone is set forth in SEQ ID NO:7, and the corresponding amino acid sequence is set forth in SEQ ID NO:8. The nucleotide sequence of the genomic DNA sequence is set forth in SEQ ID NO:12. The present invention also includes nucleotide sequences substantially similar to those set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, and SEQ ID NO:7. The present invention also encompasses plant proteins whose amino acid sequences are substantially similar to the amino acid sequences set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, and SEQ ID NO:8. Such proteins can be used in a screening assay to identify inhibitors that are potential herbicides.

[0009] In a preferred embodiment, the present invention relates to a method for identifying chemicals having the ability to inhibit GT1802, GT1209, GT1354, or GT0946 activity in plants preferably comprising the steps of: a) obtaining transgenic plants, plant tissue, plant seeds or plant cells, preferably stably transformed, comprising a non-native nucleotide sequence encoding an enzyme having GT1802, GT1209, GT1354, or GT0946 activity, respectively, and capable of overexpressing an enzymatically active GT1802, GT1209, GT1354, or GT0946 gene product (either full length or truncated but still active), respectively; b) applying a chemical to the transgenic plants, plant cells, tissues or parts and to the isogenic non-transformed plants, plant cells, tissues or parts; c) determining the growth or viability of the transgenic and non-transformed plants, plant cells, tissues or parts after application of the chemical; d) comparing the growth or viability of the transgenic and non-transformed plants, plant cells, tissues or parts after application of the chemical; and e) selecting chemicals that suppress the viability or growth of the non-transgenic plants, plant cells, tissues or parts, without significantly suppressing the viability or growth of the isogenic transgenic plants, plant cells, tissues or parts. In a preferred embodiment, the enzyme having GT1802, GT1209, GT1354, or GT0946 activity is encoded by a nucleotide sequence derived from a plant, preferably a monocotyledonous or a dicotyledonous plant, preferably a dicotyledonous plant, preferably Arabidopsis thaliana, desirably identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively. In another embodiment, the enzyme having GT1802, GT1209, GT1354, or GT0946 activity is encoded by a nucleotide sequence capable of encoding the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, respectively. In yet another embodiment, the enzyme having GT1802, GT1209, GT1354, or GT0946 activity has an amino acid sequence identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, respectively.
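
The selection logic of steps (c) through (e) of the method of paragraph [0009] can be illustrated with a minimal sketch. The specification does not fix numerical cutoffs for "suppress" or "significantly suppressing", so the record type `ScreenResult`, the function `select_candidates`, and both thresholds are hypothetical, and growth is assumed to be expressed as a fraction of an untreated control of the same line.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ScreenResult:
    """Growth after chemical application, as a fraction of the untreated control.

    Hypothetical record type; the normalization to an untreated control
    is an assumption made for this sketch.
    """
    chemical: str
    non_transformed_growth: float  # isogenic non-transformed line
    transgenic_growth: float       # line overexpressing the target gene product


def select_candidates(
    results: Iterable[ScreenResult],
    suppressed_below: float = 0.5,   # assumed cutoff for "suppressed"
    unaffected_above: float = 0.8,   # assumed cutoff for "not significantly suppressed"
) -> List[str]:
    """Return chemicals that suppress the non-transformed line while leaving
    the overexpressing transgenic line largely unaffected (step e)."""
    return [
        r.chemical
        for r in results
        if r.non_transformed_growth < suppressed_below
        and r.transgenic_growth >= unaffected_above
    ]
```

A chemical that suppresses both lines is not selected under this rule, since overexpression of the target does not rescue growth and the chemical may therefore act on a different target.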

[0010] The present invention further embodies plants, plant tissues, plant seeds, and plant cells that have modified GT1802, GT1209, GT1354, or GT0946 activity and that are therefore tolerant to inhibition by a herbicide at levels normally inhibitory to naturally occurring GT1802, GT1209, GT1354, or GT0946 activity, respectively. Herbicide tolerant plants encompassed by the invention include those that would otherwise be potential targets for normally inhibiting herbicides, particularly the agronomically important crops mentioned above. According to this embodiment, plants, plant tissue, plant seeds, or plant cells are transformed, preferably stably transformed, with a recombinant DNA molecule comprising a suitable promoter functional in plants operatively linked to a nucleotide coding sequence that encodes a modified GT1802, GT1209, GT1354, or GT0946 gene that is tolerant to inhibition by a herbicide at a concentration that would normally inhibit the activity of wild-type, unmodified GT1802, GT1209, GT1354, or GT0946 gene product, respectively. Modified GT1802, GT1209, GT1354, or GT0946 activity may also be conferred upon a plant by increasing expression of wild-type herbicide-sensitive GT1802, GT1209, GT1354, or GT0946 protein by providing multiple copies of wild-type GT1802, GT1209, GT1354, or GT0946 genes, respectively, to the plant or by overexpression of wild-type GT1802, GT1209, GT1354, or GT0946 genes, respectively, under control of a stronger-than-wild-type promoter. The transgenic plants, plant tissue, plant seeds, or plant cells thus created are then selected by conventional selection techniques, whereby herbicide tolerant lines are isolated, characterized, and developed. Alternately, random or site-specific mutagenesis may be used to generate herbicide tolerant lines.

[0011] Therefore, the present invention provides a plant, plant cell, plant seed, or plant tissue transformed with a DNA molecule comprising a nucleotide sequence isolated from a plant that encodes an enzyme having GT1802, GT1209, GT1354, or GT0946 activity, wherein the DNA expresses the GT1802, GT1209, GT1354, or GT0946 activity, respectively, and wherein the DNA molecule confers upon the plant, plant cell, plant seed, or plant tissue tolerance to a herbicide in amounts that normally inhibit naturally occurring GT1802, GT1209, GT1354, or GT0946 activity, respectively. According to one example of this embodiment, the enzyme having GT1802, GT1209, GT1354, or GT0946 activity is encoded by a nucleotide sequence identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, or has an amino acid sequence identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, respectively.

[0012] The invention also provides a method for suppressing the growth of a plant comprising the step of applying to the plant a chemical that inhibits the naturally occurring GT1802, GT1209, GT1354, or GT0946 activity in the plant. In a related aspect, the present invention is directed to a method for selectively suppressing the growth of undesired vegetation in a field containing a crop of planted crop seeds or plants, comprising the steps of: (a) optionally planting herbicide tolerant crops or crop seeds, which are plants or plant seeds that are tolerant to a herbicide that inhibits the naturally occurring GT1802, GT1209, GT1354, or GT0946 activity; and (b) applying to the herbicide tolerant crops or crop seeds and the undesired vegetation in the field a herbicide in amounts that inhibit naturally occurring GT1802, GT1209, GT1354, or GT0946 activity, respectively, wherein the herbicide suppresses the growth of the weeds without significantly suppressing the growth of the crops.

[0013] The invention thus provides:

[0014] An isolated DNA molecule comprising a nucleotide sequence substantially similar to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7. In a preferred embodiment, the nucleotide sequence encodes an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8. In another preferred embodiment, the nucleotide sequence is SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7. In yet another preferred embodiment, the nucleotide sequence encodes the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8. Preferably, the nucleotide sequence is a plant nucleotide sequence, which preferably encodes a polypeptide having GT1802, GT1209, GT1354, or GT0946 activity, respectively.

[0015] The invention further provides:

[0016] A polypeptide comprising an amino acid sequence encoded by a nucleotide sequence substantially similar to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7. Preferably, the amino acid sequence is encoded by SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7. Preferably, the polypeptide comprises an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, respectively. Preferably the amino acid sequence is SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8. The amino acid sequence preferably has GT1802, GT1209, GT1354, or GT0946 activity, respectively. In another preferred embodiment, the amino acid sequence comprises at least 20 consecutive amino acid residues of the amino acid sequence encoded by SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively. Or, alternatively, the amino acid sequence comprises at least 20 consecutive amino acid residues of the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, respectively.

[0017] The invention further provides:

[0018] An expression cassette comprising a promoter operatively linked to a DNA molecule according to the present invention; a recombinant vector comprising an expression cassette according to the present invention, wherein said vector is preferably capable of being stably transformed into a host cell; and a host cell comprising a DNA molecule according to the present invention, wherein said DNA molecule is preferably expressible in the cell. The host cell is preferably selected from the group consisting of an insect cell, a yeast cell, a prokaryotic cell and a plant cell. The invention further provides a plant or seed comprising a plant cell of the present invention, wherein the plant or seed is preferably tolerant to an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity, respectively.

[0019] The invention further provides:

[0020] A process for making nucleotide sequences encoding gene products having altered GT1802, GT1209, GT1354, or GT0946 activity, comprising: a) shuffling an unmodified nucleotide sequence of the present invention, b) expressing the resulting shuffled nucleotide sequences, and c) selecting for altered GT1802, GT1209, GT1354, or GT0946 activity, respectively, as compared to the GT1802, GT1209, GT1354, or GT0946 activity, respectively, of the gene product of said unmodified nucleotide sequence.

[0021] In a preferred embodiment, the unmodified nucleotide sequence is identical or substantially similar to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, or a homolog thereof. The present invention further provides a DNA molecule comprising a shuffled nucleotide sequence obtainable by the process described above, and a DNA molecule comprising a shuffled nucleotide sequence produced by the process described above. Preferably, a shuffled nucleotide sequence obtained by the process described above has enhanced tolerance to an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity, respectively. The invention further provides an expression cassette comprising a promoter operatively linked to a DNA molecule comprising a shuffled nucleotide sequence; a recombinant vector comprising such an expression cassette, wherein said vector is preferably capable of being stably transformed into a host cell; and a host cell comprising such an expression cassette, wherein said nucleotide sequence is preferably expressible in said cell. A preferred host cell is selected from the group consisting of an insect cell, a yeast cell, a prokaryotic cell and a plant cell. The invention further provides a plant or seed comprising such plant cell, wherein the plant is preferably tolerant to an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity, respectively.

[0022] The invention further provides:

[0023] A method for selecting compounds that interact with the protein encoded by SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, comprising: a) expressing a DNA molecule comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, or a sequence substantially similar to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, or a homolog thereof, to generate the corresponding protein, b) testing a compound suspected of having the ability to interact with the protein expressed in step (a), and c) selecting compounds that interact with the protein in step (b).

[0024] The invention further provides:

[0025] A process of identifying an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity, respectively, comprising: a) introducing a DNA molecule comprising a nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, and having GT1802, GT1209, GT1354, or GT0946 activity, respectively, or nucleotide sequences substantially similar thereto, or a homolog thereof, into a plant cell, such that said sequence is functionally expressible at levels that are higher than wild-type expression levels, b) combining said plant cell with a compound to be tested for the ability to inhibit the GT1802, GT1209, GT1354, or GT0946 activity, respectively, under conditions conducive to such inhibition, c) measuring plant cell growth under the conditions of step (b), d) comparing the growth of said plant cell with the growth of a plant cell having unaltered GT1802, GT1209, GT1354, or GT0946 activity, respectively, under identical conditions, and e) selecting said compound that inhibits plant cell growth in step (d). The invention further comprises a compound having herbicidal activity identifiable according to the process described immediately above.

[0026] The invention further comprises:

[0027] A process of identifying compounds having herbicidal activity comprising: a) combining a protein of the present invention and a compound to be tested for the ability to interact with said protein, under conditions conducive to interaction, b) selecting a compound identified in step (a) that is capable of interacting with said protein, c) applying the compound selected in step (b) to a plant to test for herbicidal activity, and d) selecting compounds having herbicidal activity. The invention further comprises a compound having herbicidal activity identifiable according to the process described immediately above.

[0028] The invention further comprises:

[0029] A method for suppressing the growth of a plant comprising applying to said plant a compound that inhibits the activity of a polypeptide of the present invention in an amount sufficient to suppress the growth of said plant.

[0030] The invention further comprises:

[0031] A method for recombinantly expressing a protein having GT1802, GT1209, GT1354, or GT0946 activity comprising introducing a nucleotide sequence encoding a protein having one of the above activities into a host cell and expressing the nucleotide sequence in the host cell. A preferred host cell is selected from the group consisting of an insect cell, a yeast cell, a prokaryotic cell and a plant cell. A preferred prokaryotic cell is a bacterial cell, e.g. E. coli.

[0032] Other objects and advantages of the present invention will become apparent to those skilled in the art from a study of the following description of the invention and non-limiting examples.

DEFINITIONS

[0033] For clarity, certain terms used in the specification are defined and presented as follows:

[0034] Co-factor: natural reactant, such as an organic molecule or a metal ion, required in an enzyme-catalyzed reaction. A co-factor is e.g. NAD(P), riboflavin (including FAD and FMN), folate, molybdopterin, thiamin, biotin, lipoic acid, pantothenic acid and coenzyme A, S-adenosylmethionine, pyridoxal phosphate, ubiquinone, menaquinone. Optionally, a co-factor can be regenerated and reused.

DNA shuffling: DNA shuffling is a method to rapidly, easily and efficiently introduce mutations or rearrangements, preferably randomly, in a DNA molecule or to generate exchanges of DNA sequences between two or more DNA molecules, preferably randomly. The DNA molecule resulting from DNA shuffling is a shuffled DNA molecule that is a non-naturally occurring DNA molecule derived from at least one template DNA molecule. The shuffled DNA encodes an enzyme modified with respect to the enzyme encoded by the template DNA, and preferably has an altered biological activity with respect to the enzyme encoded by the template DNA.

[0035] Enzyme activity: means herein the ability of an enzyme to catalyze the conversion of a substrate into a product. A substrate for the enzyme comprises the natural substrate of the enzyme but also comprises analogues of the natural substrate, which can also be converted by the enzyme into a product or into an analogue of a product. The activity of the enzyme is measured for example by determining the amount of product in the reaction after a certain period of time, or by determining the amount of substrate remaining in the reaction mixture after a certain period of time. The activity of the enzyme is also measured by determining the amount of an unused co-factor of the reaction remaining in the reaction mixture after a certain period of time or by determining the amount of used co-factor in the reaction mixture after a certain period of time. The activity of the enzyme is also measured by determining the amount of a donor of free energy or energy-rich molecule (e.g. ATP, phosphoenolpyruvate, acetyl phosphate or phosphocreatine) remaining in the reaction mixture after a certain period of time or by determining the amount of a used donor of free energy or energy-rich molecule (e.g. ADP, pyruvate, acetate or creatine) in the reaction mixture after a certain period of time.

“GT1802 Gene” as used herein refers to a DNA molecule comprising a nucleotide sequence encoding SEQ ID NO:2, or a nucleotide sequence substantially similar thereto. Preferably, the nucleotide sequence is set forth in SEQ ID NO:1 or is substantially similar to SEQ ID NO:1.

“GT1209 Gene” as used herein refers to a DNA molecule comprising a nucleotide sequence encoding SEQ ID NO:4, or a nucleotide sequence substantially similar thereto. Preferably, the nucleotide sequence is set forth in SEQ ID NO:3 or is substantially similar to SEQ ID NO:3.

“GT1354 Gene” as used herein refers to a DNA molecule comprising a nucleotide sequence encoding SEQ ID NO:6, or a nucleotide sequence substantially similar thereto. Preferably, the nucleotide sequence is set forth in SEQ ID NO:5 or is substantially similar to SEQ ID NO:5.

“GT0946 Gene” as used herein refers to a DNA molecule comprising a nucleotide sequence encoding SEQ ID NO:8, or a nucleotide sequence substantially similar thereto. Preferably, the nucleotide sequence is set forth in SEQ ID NO:7 or is substantially similar to SEQ ID NO:7.

[0036] Herbicide: a chemical substance used to kill or suppress the growth of plants, plant cells, plant seeds, or plant tissues.

[0037] Heterologous DNA Sequence: a DNA sequence not naturally associated with a host cell into which it is introduced, including non-naturally occurring multiple copies of a naturally occurring DNA sequence; and genetic constructs wherein an otherwise homologous DNA sequence is operatively linked to a non-native sequence.

[0038] Homologous DNA Sequence: a DNA sequence naturally associated with a host cell into which it is introduced.

[0039] Inhibitor: a chemical substance that causes abnormal growth, e.g., by inactivating the enzymatic activity of a protein such as a biosynthetic enzyme, receptor, signal transduction protein, structural gene product, or transport protein that is essential to the growth or survival of the plant. In the context of the instant invention, an inhibitor is a chemical substance that alters the enzymatic activity encoded by the GT1802, GT1209, GT1354, or GT0946 gene from a plant. More generally, an inhibitor causes abnormal growth of a host cell by interacting with the gene product encoded by the GT1802, GT1209, GT1354, or GT0946 gene.

[0040] Isogenic: plants which are genetically identical, except that they may differ by the presence or absence of a heterologous DNA sequence.

[0041] Isolated: in the context of the present invention, an isolated DNA molecule or an isolated enzyme is a DNA molecule or enzyme that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated DNA molecule or enzyme may exist in a purified form or may exist in a non-native environment such as, for example, in a transgenic host cell.

[0042] Mature protein: protein which is normally targeted to a cellular organelle, such as a chloroplast, and from which the transit peptide has been removed.

[0043] Minimal Promoter: promoter elements, particularly a TATA element, that are inactive or that have greatly reduced promoter activity in the absence of upstream activation. In the presence of a suitable transcription factor, the minimal promoter functions to permit transcription.

[0044] Modified Enzyme Activity: enzyme activity different from that which naturally occurs in a plant (i.e. enzyme activity that occurs naturally in the absence of direct or indirect manipulation of such activity by man), which is tolerant to inhibitors that inhibit the naturally occurring enzyme activity.

[0045] Pre-protein: protein which is normally targeted to a cellular organelle, such as a chloroplast, and still comprising its transit peptide.

[0046] Significant Increase: an increase in enzymatic activity that is larger than the margin of error inherent in the measurement technique, preferably an increase by about 2-fold or greater of the activity of the wild-type enzyme in the presence of the inhibitor, more preferably an increase by about 5-fold or greater, and most preferably an increase by about 10-fold or greater.

[0047] Significantly less: means that the amount of a product of an enzymatic reaction is reduced by more than the margin of error inherent in the measurement technique, preferably a decrease by about 2-fold or greater of the activity of the wild-type enzyme in the absence of the inhibitor, more preferably a decrease by about 5-fold or greater, and most preferably a decrease by about 10-fold or greater.
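
The fold-change tiers recited in paragraphs [0046] and [0047] can be expressed as a small helper. This is a sketch only: the function names `fold_change` and `preference_tier` are hypothetical, the word "about" in the definitions is treated here as a hard cutoff, and the margin-of-error test is left to the caller because it depends on the measurement technique.

```python
def fold_change(numerator: float, denominator: float) -> float:
    """Ratio of two activity measurements taken in the same units.

    For an increase ([0046]) pass modified activity over wild-type activity,
    both measured in the presence of the inhibitor. For a decrease ([0047])
    pass the uninhibited wild-type value over the reduced value, so that a
    larger ratio always means a larger effect.
    """
    if numerator < 0 or denominator <= 0:
        raise ValueError("activities must be non-negative, with a positive denominator")
    return numerator / denominator


def preference_tier(fold: float) -> str:
    """Map a fold change onto the 2-fold, 5-fold, and 10-fold tiers."""
    if fold >= 10:
        return "most preferred"
    if fold >= 5:
        return "more preferred"
    if fold >= 2:
        return "preferred"
    return "below 2-fold"
```

For example, an activity of 12 units for a modified enzyme against 1 unit for the wild-type enzyme, both measured with the inhibitor present, gives a 12-fold increase and falls in the highest tier.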

[0048] In its broadest sense, the term “substantially similar”, when used herein with respect to a nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide sequence, wherein the corresponding sequence encodes a polypeptide having substantially the same structure and function as the polypeptide encoded by the reference nucleotide sequence. Desirably the substantially similar nucleotide sequence encodes the polypeptide encoded by the reference nucleotide sequence. The term “substantially similar” is specifically intended to include nucleotide sequences wherein the sequence has been modified to optimize expression in particular cells. In the context of the “GT1802 gene”, “substantially similar” refers to nucleotide sequences that encode a protein at least 79% identical, more preferably at least 85% identical, still more preferably at least 90% identical, still more preferably at least 95% identical, yet still more preferably at least 99% identical to SEQ ID NO:2; in the context of the “GT1209 gene”, “substantially similar” refers to nucleotide sequences that encode a protein at least 39% identical, more preferably at least 50% identical, still more preferably at least 60% identical, still more preferably at least 85% identical, still more preferably at least 95% identical, yet still more preferably at least 99% identical to SEQ ID NO:4; in the context of the “GT1354 gene”, “substantially similar” refers to nucleotide sequences that encode a protein at least 42% identical, more preferably at least 55% identical, more preferably at least 65% identical, still more preferably at least 75% identical, still more preferably at least 85% identical, still more preferably at least 95% identical, yet still more preferably at least 99% identical to SEQ ID NO:6; in the context of the “GT0946 gene”, “substantially similar” refers to nucleotide sequences that encode a protein at least 85% identical, more preferably at least 90% identical, still more preferably at least 95% identical, yet still more preferably at least 99% identical to SEQ ID NO:8, wherein said protein sequence comparisons are conducted using GAP analysis as described below. A nucleotide sequence “substantially similar” to the reference nucleotide sequence hybridizes to the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C., more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 1×SSC, 0.1% SDS at 50° C., more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.5×SSC, 0.1% SDS at 50° C., preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 50° C., more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 65° C.

[0049] “Homologs of the GT1802 gene” include nucleotide sequences that encode an amino acid sequence that is at least 40% identical to SEQ ID NO:2, more preferably at least 55% identical, yet still more preferably at least 70% identical to SEQ ID NO:2, as measured using the GAP parameters described below, wherein the amino acid sequence encoded by the homolog has the biological activity of the GT1802 protein.

[0050] “Homologs of the GT1209 gene” include nucleotide sequences that encode an amino acid sequence that is at least 30% identical to SEQ ID NO:4, more preferably at least 40% identical, still more preferably at least 50% identical, still more preferably at least 60% identical, yet still more preferably at least 80% identical to SEQ ID NO:4, as measured using the GAP parameters described below, wherein the amino acid sequence encoded by the homolog has the biological activity of the GT1209 protein.

[0051] “Homologs of the GT1354 gene” include nucleotide sequences that encode an amino acid sequence that is at least 30% identical to SEQ ID NO:6, more preferably at least 40% identical, yet still more preferably at least 60% identical to SEQ ID NO:6, as measured using the GAP parameters described below, wherein the amino acid sequence encoded by the homolog has the biological activity of the GT1354 protein.

[0052] “Homologs of the GT0946 gene” include nucleotide sequences that encode an amino acid sequence that is at least 30% identical to SEQ ID NO:8, more preferably at least 50% identical, still more preferably at least 60% identical, yet still more preferably at least 80% identical to SEQ ID NO:8, as measured using the GAP parameters described below, wherein the amino acid sequence encoded by the homolog has the biological activity of the GT0946 protein.

[0053] The term “substantially similar”, when used herein with respect to a protein, means a protein corresponding to a reference protein, wherein the protein has substantially the same structure and function as the reference protein, e.g. where only changes in amino acid sequence not affecting the polypeptide function occur. When used in the context of the “GT1802 gene”, the percentage of identity between the substantially similar protein or amino acid sequence and the reference protein or amino acid sequence (in this case SEQ ID NO:2) is at least 79%, more preferably at least 85%, still more preferably at least 90%, still more preferably at least 95%, yet still more preferably at least 99%, as determined using default GAP analysis parameters with the University of Wisconsin GCG, SEQWEB application of GAP, based on the algorithm of Needleman and Wunsch (Needleman and Wunsch (1970) J. Mol. Biol. 48: 443-453). In the context of the “GT1209 gene”, the percentage of identity between the substantially similar protein or amino acid sequence and the reference protein or amino acid sequence (in this case SEQ ID NO:4) is at least 39%, more preferably at least 50%, still more preferably at least 60%, still more preferably at least 85%, still more preferably at least 95%, yet still more preferably at least 99%. In the context of the “GT1354 gene”, the percentage of identity between the substantially similar protein or amino acid sequence and the reference protein or amino acid sequence (in this case SEQ ID NO:6) is at least 42%, more preferably at least 55%, more preferably at least 65%, still more preferably at least 75%, still more preferably at least 85%, still more preferably at least 95%, yet still more preferably at least 99%. In the context of the “GT0946 gene”, the percentage of identity between the substantially similar protein or amino acid sequence and the reference protein or amino acid sequence (in this case SEQ ID NO:8) is at least 85%, more preferably at least 90%, still more preferably at least 95%, yet still more preferably at least 99%.
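
The percent-identity thresholds above rest on a global alignment of the two protein sequences. The following is a minimal sketch of a Needleman-Wunsch alignment followed by an identity count. It is an illustration only: the scoring values (match, mismatch, and a linear gap penalty) are placeholder assumptions and are not the default parameters of the GCG GAP program, which uses a substitution matrix and separate gap creation and extension penalties. The convention of counting identity over ungapped alignment columns is likewise an assumption of this sketch.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return the two aligned strings.

    Scoring values are illustrative assumptions, not GCG GAP defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of ungapped alignment columns in which the residues match."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y)
    return 100.0 * identical / len(columns)


# Hypothetical short peptides, used only to exercise the functions.
print(percent_identity("MKTAYIAK", "MKTAYLAK"))
```

A candidate sequence would then be compared with the relevant threshold, for example 79% for SEQ ID NO:2, bearing in mind that values obtained with this simplified scoring may differ from those reported by GAP.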

[0054] As used herein the term “GT1802 protein” refers to an amino acid sequence encoded by a DNA molecule comprising a nucleotide sequence substantially similar to SEQ ID NO:1. “Homologs of the GT1802 protein” are amino acid sequences that are at least 40% identical to SEQ ID NO:2, more preferably at least 55% identical, yet still more preferably at least 70% identical to SEQ ID NO:2, as measured using the GAP parameters described above, wherein the homologs of the GT1802 protein have the biological activity of the GT1802 protein.

[0055] As used herein the term “GT1209 protein” refers to an amino acid sequence encoded by a DNA molecule comprising a nucleotide sequence substantially similar to SEQ ID NO:3. “Homologs of the GT1209 protein” are amino acid sequences that are at least 30% identical to SEQ ID NO:4, more preferably at least 40% identical, still more preferably at least 50% identical, still more preferably at least 60% identical, yet still more preferably at least 80% identical to SEQ ID NO:4, as measured using the GAP parameters described above, wherein the homologs of the GT1209 protein have the biological activity of the GT1209 protein.

[0056] As used herein the term “GT1354 protein” refers to an amino acid sequence encoded by a DNA molecule comprising a nucleotide sequence substantially similar to SEQ ID NO:5. “Homologs of the GT1354 protein” are amino acid sequences that are at least 30% identical to SEQ ID NO:6, more preferably at least 40% identical, yet still more preferably at least 60% identical to SEQ ID NO:6, as measured using the GAP parameters described above, wherein the homologs of the GT1354 protein have the biological activity of the GT1354 protein.

[0057] As used herein the term “GT0946 protein” refers to an amino acid sequence encoded by a DNA molecule comprising a nucleotide sequence substantially similar to SEQ ID NO:7. “Homologs of the GT0946 protein” are amino acid sequences that are at least 30% identical to SEQ ID NO:8, more preferably at least 50% identical, still more preferably at least 60% identical, yet still more preferably at least 80% identical to SEQ ID NO:8, as measured using the GAP parameters described above, wherein the homologs of the GT0946 protein have the biological activity of the GT0946 protein.

[0058] Substrate: a substrate is the molecule that an enzyme naturally recognizes and converts to a product in the biochemical pathway in which the enzyme naturally carries out its function, or is a modified version of the molecule, which is also recognized by the enzyme and is converted by the enzyme to a product in an enzymatic reaction similar to the naturally occurring reaction.

[0059] Tolerance: the ability to continue essentially normal growth or function (i.e. no more than 5% of herbicide tolerant plants show phytotoxicity) when exposed to an inhibitor or herbicide in an amount sufficient to suppress the normal growth or function of native, unmodified plants.

[0060] Transformation: a process for introducing heterologous DNA into a cell, tissue, or plant.

[0061] Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

[0062] Transgenic: stably transformed with a recombinant DNA molecule that preferably comprises a suitable promoter operatively linked to a DNA sequence of interest.

BRIEF DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING

[0063] SEQ ID NO:1 cDNA coding sequence for the Arabidopsis GT1802 gene

[0064] SEQ ID NO:2 amino acid sequence encoded by the Arabidopsis GT1802 nucleotide sequence shown in SEQ ID NO:1

[0065] SEQ ID NO:3 cDNA coding sequence of the Arabidopsis GT1209 gene

[0066] SEQ ID NO:4 amino acid sequence encoded by the Arabidopsis GT1209 nucleotide sequence shown in SEQ ID NO:3

[0067] SEQ ID NO:5 cDNA coding sequence for the Arabidopsis GT1354 gene

[0068] SEQ ID NO:6 amino acid sequence encoded by the Arabidopsis GT1354 nucleotide sequence shown in SEQ ID NO:5

[0069] SEQ ID NO:7 cDNA coding sequence for the Arabidopsis GT0946 gene

[0070] SEQ ID NO:8 amino acid sequence encoded by the Arabidopsis GT0946 nucleotide sequence shown in SEQ ID NO:7

[0071] SEQ ID NO:9 genomic sequence of the Arabidopsis GT1802 gene

[0072] SEQ ID NO:10 genomic sequence of the Arabidopsis GT1209 gene

[0073] SEQ ID NO:11 genomic sequence of the Arabidopsis GT1354 gene

[0074] SEQ ID NO:12 genomic sequence of the Arabidopsis GT0946 gene

[0075] SEQ ID NO:13 oligonucleotide LWAD1

[0076] SEQ ID NO:14 oligonucleotide CA51

[0077] SEQ ID NO:15 oligonucleotide CA52

[0078] SEQ ID NO:16 oligonucleotide CA53

[0079] SEQ ID NO:17 oligonucleotide CA54

[0080] SEQ ID NO:18 oligonucleotide CA55

[0081] SEQ ID NO:19 oligonucleotide 5A

[0082] SEQ ID NO:20 oligonucleotide 5B

[0083] SEQ ID NO:21 oligonucleotide 5C

[0084] SEQ ID NO:22 oligonucleotide 3A

[0085] SEQ ID NO:23 oligonucleotide 3B

[0086] SEQ ID NO:24 oligonucleotide 3C

[0087] SEQ ID NO:25 second cDNA coding sequence for the Arabidopsis GT1209 gene

[0088] SEQ ID NO:26 amino acid sequence encoded by the Arabidopsis GT1209 nucleotide sequence shown in SEQ ID NO:25

[0089] I. Essentiality of the GT1802, GT1209, GT1354, and GT0946 Genes in Arabidopsis Demonstrated by Ac/Ds Transposon Mutagenesis

[0090] As shown in the examples below, the identification of a novel gene structure, as well as the essentiality of the GT1802, GT1209, GT1354, or GT0946 genes for normal plant growth and development, have been demonstrated for the first time in Arabidopsis using Ac/Ds transposon mutagenesis. Having established the essentiality of GT1802, GT1209, GT1354, and GT0946 functions in plants and having identified the genes encoding these essential activities, the inventors thereby provide an important and sought after tool for new herbicide development.

[0091] Arabidopsis insertional mutant lines segregating for seedling lethal mutations are identified as a first step in the identification of essential proteins. Ds transposon insertion lines were produced as described in Sundaresan et al. (1995) Genes and Dev., 9:1797-1810, incorporated herein by reference. Starting with F3 or F4 seeds collected from single F2 or F3 kanamycin-resistant plants containing Ds insertions in their genomes (see FIG. 3 of Sundaresan et al. (1995) Genes and Dev., 9:1797-1810), those lines segregating homozygous lethal seedlings are identified. These lines are found by placing seeds onto minimal plant growth media, which contains the fungicides benomyl and maxim, and screening for inviable seedlings after 7 and 14 days in the light at room temperature. Inviable phenotypes include altered pigmentation or altered morphology. These phenotypes are observed either on plates directly or in soil following transplantation of seedlings.

[0092] When a line is identified as segregating a seedling lethal, it is determined if the resistance marker in the Ds transposon insertion co-segregates with the lethality (Errampalli et al. (1991) The Plant Cell, 3:149-157). Co-segregation analysis is done by placing the seeds on media containing the selective agent and scoring the seedlings for resistance or sensitivity to the agent. Examples of selective agents used are kanamycin, hygromycin, or phosphinothricin. About 35 resistant seedlings are transplanted to soil and their progeny are examined for the segregation of the seedling lethal. In the case in which the Ds transposon insertion disrupts an essential gene, there is co-segregation of the resistance phenotype and the seedling lethal phenotype in every plant. Therefore, in such a case, all resistant plants segregate seedling lethals in the next generation; this result indicates that each of the resistant plants is heterozygous for the DNA causing both phenotypes.
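
The strength of the co-segregation test can be illustrated with a short calculation. The sketch below assumes, purely for illustration, a single recessive seedling-lethal mutation that is unlinked to the Ds insertion and segregates in a Mendelian fashion. Under that assumption, a viable resistant seedling is heterozygous for the lethal mutation with probability 2/3, so the chance that every one of the transplanted resistant plants segregates lethals without true linkage falls off quickly with the number of plants examined. The function name is hypothetical.

```python
from fractions import Fraction


def prob_all_segregate_if_unlinked(n_plants):
    """Probability that all n viable resistant plants carry the lethal allele
    by chance, assuming one recessive lethal locus unlinked to the Ds insertion.

    Among viable progeny of a heterozygote (homozygous lethals die), the
    genotypes are 1/3 wild type and 2/3 heterozygous.
    """
    p_heterozygous = Fraction(2, 3)
    return float(p_heterozygous ** n_plants)


# About 35 resistant seedlings are transplanted in the procedure above.
print(prob_all_segregate_if_unlinked(35))
```

With 35 plants the chance outcome is expected fewer than once in a million trials under these assumptions, which is why uniform segregation of lethals among the resistant plants is taken as evidence that the insertion and the lethal mutation are linked.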

[0093] For the Arabidopsis lines showing co-segregation of the transposon-encoded resistance marker and the lethal phenotype, PCR-based molecular approaches such as TAIL-PCR (Liu et al. (1995) The Plant Journal, 8:457-463; Liu and Whittier (1995), Genomics, 25: 674-681), vectorette PCR (Riley et al. (1990) Nucleic Acids Research, 18: 2887-2890), or the GenomeWalker™ kit (CLONTECH Laboratories, Inc., Palo Alto, Calif.) may be used to directly amplify the plant DNA fragments flanking the transposon. Each of these techniques utilizes the known sequence of the transposon, and can be used to recover small (less than 5 kb) fragments directly adjacent to the insertion. PCR products are isolated and their DNA sequence is determined. The resulting sequences are analyzed for the presence of non-Ds transposon vector sequences. When such sequences are found, they are used to search DNA and protein databases using the BLAST and BLAST2 programs (Altschul et al. (1990) J. Mol. Biol. 215: 403-410; Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402, both incorporated herein by reference). Additional genomic and cDNA sequences for each gene are identified by standard molecular biology procedures.

[0094] II. Sequence of the Arabidopsis GT1802 Gene

[0095] The Arabidopsis GT1802 gene is identified by isolating DNA flanking the Ds transposon border from the tagged seedling-lethal line #GT1802. A region of the Arabidopsis DNA flanking the Ds transposon border corresponds to Arabidopsis genomic sequence (chromosome 4, BAC F4C21, GenBank accession #AC005275). The inventors are the first to demonstrate that the GT1802 gene product is essential for normal growth and development in plants, as well as defining the function of the GT1802 gene through protein homology. The present invention discloses the cDNA coding nucleotide sequence of the Arabidopsis GT1802 gene as well as the amino acid sequence of the Arabidopsis GT1802 protein.

[0096] The present invention also encompasses an isolated amino acid sequence derived from a plant, wherein said amino acid sequence is identical or substantially similar to the amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO:1, wherein said amino acid sequence has GT1802 activity. Using BLAST and BLAST2 programs with the default settings, the sequence of the GT1802 gene shows similarity to a subunit of the cytochrome B6-F complex from Chlamydomonas reinhardtii (GenPept Accession #CAA53947), Oryza sativa (GenPept Accession #AAC78103), Synechocystis (GenPept Accession #CAA41421), Pisum sativum (GenPept Accession #CAA45151 and Genbank Accession #X63605), Spinacia oleracea (GenPept Accession #CAA29590), and Nicotiana tabacum (SWISS PROT Accession #Q02585).

[0097] III. Sequence of the Arabidopsis GT1209 Gene

[0098] The Arabidopsis GT1209 gene is identified by isolating DNA flanking the Ds transposon border from the tagged seedling-lethal line #GT1209. A region of the Arabidopsis DNA flanking the Ds transposon border corresponds to Arabidopsis genomic sequence (chromosome 1, clone F12K11, Genbank accession number AC007592). The inventors are the first to demonstrate that the GT1209 gene product is essential for normal growth and development in plants, as well as defining the function of the GT1209 gene through protein homology. The present invention discloses the cDNA coding nucleotide sequence of the Arabidopsis GT1209 gene as well as the amino acid sequence of the Arabidopsis GT1209 protein.

[0099] The present invention also encompasses an isolated amino acid sequence derived from a plant, wherein said amino acid sequence is identical or substantially similar to the amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO:3, wherein said amino acid sequence has GT1209 activity. Using BLAST and BLAST2 programs with the default settings, the sequence of the GT1209 gene shows similarity to an anaphase-promoting complex subunit 5 (APC5)-like protein from Mus musculus (GenPept Accession #BAA95076) as well as to an anaphase-promoting complex subunit 5 (APC5) protein from Homo sapiens (GenPept Accession #AAF05753 and Genbank Accession #AF191339).

[0100] IV. Sequence of the Arabidopsis GT1354 Gene

[0101] The Arabidopsis GT1354 gene is identified by isolating DNA flanking the Ds transposon border from the tagged seedling-lethal line #GT1354. A region of the Arabidopsis DNA flanking the Ds transposon border corresponds to Arabidopsis genomic sequence (Section 179 of 255 on chromosome 2, Genbank accession number AC006533). The inventors are the first to demonstrate that the GT1354 gene product is essential for normal growth and development in plants, as well as defining the function of the GT1354 gene through protein homology. The present invention discloses the cDNA coding nucleotide sequence of the Arabidopsis GT1354 gene as well as the amino acid sequence of the Arabidopsis GT1354 protein.

[0102] The present invention also encompasses an isolated amino acid sequence derived from a plant, wherein said amino acid sequence is identical or substantially similar to the amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO:5, wherein said amino acid sequence has GT1354 activity. Using BLAST and BLAST2 programs with the default settings, the sequence of the GT1354 gene shows similarity to a hypothetical protein from Arabidopsis thaliana (GenPept Accession #CAB81447).

[0103] V. Sequence of the Arabidopsis GT0946 Gene

[0104] The Arabidopsis GT0946 gene is identified by isolating DNA flanking the Ds transposon border from the tagged seedling-lethal line #GT0946. A region of the Arabidopsis DNA flanking the Ds transposon border corresponds to Arabidopsis genomic sequence (section 10 of 255 on chromosome 2, Genbank accession number AC004136). The inventors are the first to demonstrate that the GT0946 gene product is essential for normal growth and development in plants, as well as defining the function of the GT0946 gene through protein homology. The present invention discloses the cDNA coding nucleotide sequence of the Arabidopsis GT0946 gene as well as the amino acid sequence of the Arabidopsis GT0946 protein.

[0105] The present invention also encompasses an isolated amino acid sequence derived from a plant, wherein said amino acid sequence is identical or substantially similar to the amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO:7, wherein said amino acid sequence has GT0946 activity. Using BLAST and BLAST2 programs with the default settings, the sequence of the GT0946 gene shows similarity to 4-diphosphocytidyl-2C-methyl-D-erythritol synthase-like proteins from Bacillus subtilis (GenPept Accession #AAA21796), Haemophilus influenzae (GenPept Accession #AAC22332), Escherichia coli (GenPept Accession #AAF43207), Mycobacterium tuberculosis (GenPept Accession #CAB07156), Synechocystis sp. (GenPept Accession #BAA18417), and Brassica napus (EST sequence, Genbank Accession #AI352824).

[0106] VI. Recombinant Production of GT1802, GT1209, GT1354, or GT0946 Activities and Uses Thereof

[0107] For recombinant production of GT1802, GT1209, GT1354, or GT0946 activities in a host organism, a nucleotide sequence encoding a protein having GT1802, GT1209, GT1354, or GT0946 activity, respectively, is inserted into an expression cassette designed for the chosen host and introduced into the host where it is recombinantly produced. For example, SEQ ID NO:1, nucleotide sequences substantially similar to SEQ ID NO:1, or homologs of the GT1802 gene are used for the recombinant production of a protein having GT1802 activity. For example, SEQ ID NO:3, nucleotide sequences substantially similar to SEQ ID NO:3, or homologs of the GT1209 gene are used for the recombinant production of a protein having GT1209 activity. For example, SEQ ID NO:5, nucleotide sequences substantially similar to SEQ ID NO:5, or homologs of the GT1354 gene are used for the recombinant production of a protein having GT1354 activity. For example, SEQ ID NO:7, nucleotide sequences substantially similar to SEQ ID NO:7, or homologs of the GT0946 gene are used for the recombinant production of a protein having GT0946 activity. The choice of specific regulatory sequences such as promoter, signal sequence, 5′ and 3′ untranslated sequences, and enhancer appropriate for the chosen host is within the level of skill of the routineer in the art. The resultant molecule, containing the individual elements operably linked in proper reading frame, may be inserted into a vector capable of being transformed into the host cell. Suitable expression vectors and methods for recombinant production of proteins are well known for host organisms such as E. coli, yeast, and insect cells (see, e.g., Luckow and Summers, Bio/Technol. 6: 47 (1988)), and include baculovirus expression vectors, e.g., those derived from the genome of Autographa californica nuclear polyhedrosis virus (AcMNPV). A preferred baculovirus/insect system is pAcHLT (Pharmingen, San Diego, Calif.) used to transfect Spodoptera frugiperda Sf9 cells (ATCC) in the presence of linear Autographa californica baculovirus DNA (Pharmingen, San Diego, Calif.). The resulting virus is used to infect HighFive Trichoplusia ni cells (Invitrogen, La Jolla, Calif.).

[0108] In a preferred embodiment, the nucleotide sequence encoding a protein having GT1802, GT1209, GT1354, or GT0946 activity is derived from a eukaryote, such as a mammal, a fly or a yeast, but is preferably derived from a plant, preferably a monocotyledonous or a dicotyledonous plant. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, or encodes a protein having GT1802 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:3, or encodes a protein having GT1209 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:4. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:5, or encodes a protein having GT1354 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:6. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:7, or encodes a protein having GT0946 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:8. In another preferred embodiment, the nucleotide sequence encoding a protein having GT1802, GT1209, GT1354, or GT0946 activity, respectively, is derived from a prokaryote. Recombinantly produced protein having GT1802, GT1209, GT1354, or GT0946 activity is isolated and purified using a variety of standard techniques. The actual techniques that may be used will vary depending upon the host organism used, whether the protein is designed for secretion, and other such factors familiar to the skilled artisan (see, e.g. chapter 16 of Ausubel, F. et al., “Current Protocols in Molecular Biology”, pub. by John Wiley & Sons, Inc. (1994)).

[0109] Assays Utilizing the GT1802, GT1209, GT1354, or GT0946 Proteins

[0110] Recombinantly produced proteins having GT1802, GT1209, GT1354, or GT0946 activity are useful for a variety of purposes. For example, they can be used in in vitro assays to screen known herbicidal chemicals whose target has not been identified to determine if they inhibit GT1802, GT1209, GT1354, or GT0946, respectively. Such in vitro assays may also be used as more general screens to identify chemicals that inhibit such enzymatic activity and that are therefore novel herbicide candidates. Alternatively, recombinantly produced proteins having GT1802, GT1209, GT1354, or GT0946 activity may be used to elucidate the complex structure of these molecules and to further characterize their association with known inhibitors in order to rationally design new inhibitory herbicides as well as herbicide tolerant forms of the enzymes.

[0111] In Vitro Inhibitor Assays: Discovery of Small Molecule Ligand that Interacts with the Gene Product of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7

[0112] Once a protein has been identified as a potential herbicide target, the next step is to develop an assay that allows screening a large number of chemicals to determine which ones interact with the protein. Although it is straightforward to develop assays for proteins of known function, developing assays with proteins of unknown function is more difficult. This difficulty can be overcome by using technologies that can detect interactions between a protein and a compound without knowing the biological function of the protein. A short description of three methods is presented: fluorescence correlation spectroscopy, surface-enhanced laser desorption/ionization, and Biacore technologies.

[0113] Fluorescence Correlation Spectroscopy (FCS) theory was developed in 1972 but it is only in recent years that the technology to perform FCS became available (Magde et al. (1972) Phys. Rev. Lett., 29: 705-708; Maiti et al. (1997) Proc. Natl. Acad. Sci. USA, 94: 11753-11757). FCS measures the average diffusion rate of a fluorescent molecule within a small sample volume. The sample size can be as low as 10³ fluorescent molecules and the sample volume as low as the cytoplasm of a single bacterium. The diffusion rate is a function of the mass of the molecule and decreases as the mass increases. FCS can therefore be applied to protein-ligand interaction analysis by measuring the change in mass and therefore in diffusion rate of a molecule upon binding. In a typical experiment, the target to be analyzed is expressed as a recombinant protein with a sequence tag, such as a poly-histidine sequence, inserted at the N- or C-terminus. The expression takes place in E. coli, yeast or insect cells. The protein is purified by chromatography. For example, the poly-histidine tag can be used to bind the expressed protein to a metal chelate column such as Ni²⁺ chelated on iminodiacetic acid agarose. The protein is then labeled with a fluorescent tag such as carboxytetramethylrhodamine or BODIPY® (Molecular Probes, Eugene, Oreg.). The protein is then exposed in solution to the potential ligand, and its diffusion rate is determined by FCS using instrumentation available from Carl Zeiss, Inc. (Thornwood, N.Y.). Ligand binding is determined by changes in the diffusion rate of the protein.
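
The dependence of diffusion rate on mass described above can be made concrete with a rough estimate. The sketch below assumes a compact, roughly spherical particle, for which the Stokes-Einstein relation gives a diffusion coefficient inversely proportional to the hydrodynamic radius and hence approximately proportional to the inverse cube root of the mass. This scaling is an assumption introduced here for illustration and is not stated in the method above; the function name and the example masses are hypothetical.

```python
def relative_diffusion_coefficient(mass_free_kda, mass_bound_kda):
    """Estimate D_bound / D_free for a compact globular particle.

    Assumes D is proportional to M ** (-1/3), i.e. the Stokes-Einstein
    relation with a radius that grows as the cube root of the mass.
    """
    if mass_free_kda <= 0 or mass_bound_kda <= 0:
        raise ValueError("masses must be positive")
    return (mass_free_kda / mass_bound_kda) ** (1.0 / 3.0)


# Hypothetical case: a 50 kDa labeled species whose complex weighs 100 kDa.
print(relative_diffusion_coefficient(50.0, 100.0))
```

An estimate of this kind gives a sense of how large a shift in diffusion rate might be expected for a given change in mass before an experiment is run.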

[0114] Surface-Enhanced Laser Desorption/Ionization (SELDI) was invented by Hutchens and Yip during the late 1980's (Hutchens and Yip (1993) Rapid Commun. Mass Spectrom. 7: 576-580). When coupled to a time-of-flight mass spectrometer (TOF), SELDI provides a means to rapidly analyze molecules retained on a chip. It can be applied to ligand-protein interaction analysis by covalently binding the target protein on the chip and analyzing by MS the small molecules that bind to this protein (Worrall et al. (1998) Anal. Biochem. 70: 750-756). In a typical experiment, the target to be analyzed is expressed as described for FCS. The purified protein is then used in the assay without further preparation. It is bound to the SELDI chip either by utilizing the poly-histidine tag or by other interaction such as ion exchange or hydrophobic interaction. The chip thus prepared is then exposed to the potential ligand via, for example, a delivery system capable of pipetting the ligands in a sequential manner (autosampler). The chip is then subjected to washes of increasing stringency, for example a series of washes with buffer solutions of increasing ionic strength. After each wash, the bound material is analyzed by subjecting the chip to SELDI-TOF. Ligands that specifically bind the target will be identified by the stringency of the wash needed to elute them.

[0115] Biacore relies on changes in the refractive index at the surface layer upon binding of a ligand to a protein immobilized on the layer. In this system, a collection of small ligands is injected sequentially in a 2-5 microlitre cell with the immobilized protein. Binding is detected by surface plasmon resonance (SPR) by recording laser light refracting from the surface. In general, the refractive index change for a given change of mass concentration at the surface layer is practically the same for all proteins and peptides, allowing a single method to be applicable for any protein (Liedberg et al. (1983) Sensors Actuators 4: 299-304; Malmqvist (1993) Nature, 361: 186-187). In a typical experiment, the target to be analyzed is expressed as described for FCS. The purified protein is then used in the assay without further preparation. It is bound to the Biacore chip either by utilizing the poly-histidine tag or by other interaction such as ion exchange or hydrophobic interaction. The chip thus prepared is then exposed to the potential ligand via the delivery system incorporated in the instruments sold by Biacore (Uppsala, Sweden) to pipet the ligands in a sequential manner (autosampler). The SPR signal on the chip is recorded and changes in the refractive index indicate an interaction between the immobilized target and the ligand. Analysis of the on-rate and off-rate kinetics of the signal allows discrimination between non-specific and specific interactions.
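
The on-rate and off-rate analysis mentioned above is commonly summarized with a simple 1:1 binding model. The sketch below assumes that model for illustration; the method described above does not specify a kinetic model, and the function names, parameter names, and numerical values are hypothetical. Under this model the equilibrium dissociation constant is the ratio of the off rate to the on rate, and the response during ligand injection rises exponentially toward a concentration-dependent plateau.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D = k_off / k_on (1:1 model).

    k_on is in 1/(M*s) and k_off in 1/s, so K_D is in M.
    """
    return k_off / k_on


def response_association(t, conc, k_on, k_off, r_max):
    """SPR response at time t (s) during injection of ligand at conc (M).

    Assumed 1:1 model: R(t) = R_eq * (1 - exp(-(k_on * conc + k_off) * t)),
    with R_eq = r_max * conc / (conc + K_D).
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + k_off / k_on)
    return r_eq * (1.0 - math.exp(-k_obs * t))


# Hypothetical rate constants for a candidate ligand.
k_on, k_off = 1.0e5, 1.0e-3
print(dissociation_constant(k_on, k_off))
print(response_association(60.0, 1.0e-8, k_on, k_off, 100.0))
```

In practice the rate constants would be obtained by fitting recorded sensorgrams, and a ligand whose signal does not follow saturable kinetics of this kind would be a candidate for non-specific interaction.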

[0116] An assay for small molecule ligands that interact with a polypeptide may also be an inhibitor assay. For example, such an inhibitor assay, useful for identifying inhibitors of the products of essential plant genes, such as the GT1802, GT1209, GT1354, or GT0946 genes, comprises the steps of: a) reacting a GT1802, GT1209, GT1354, or GT0946 protein, respectively, and a substrate thereof in the presence of a suspected inhibitor of the protein's respective function; b) comparing the rate of enzymatic activity of the protein in the presence of the suspected inhibitor to the rate of enzymatic activity under the same conditions in the absence of the suspected inhibitor; and c) determining whether the suspected inhibitor inhibits the GT1802, GT1209, GT1354, or GT0946 protein, respectively.

[0117] For example, the inhibitory effect on GT1802, GT1209, GT1354, or GT0946 activity may be determined by a reduction or complete inhibition of GT1802, GT1209, GT1354, or GT0946 activity, respectively, in the assay. Such a determination may be made by comparing, in the presence and absence of the candidate inhibitor, the amount of substrate used or intermediate or product made during the reaction.
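
The comparison of rates in the presence and absence of a suspected inhibitor can be expressed as a percent inhibition. The following is a minimal sketch of that arithmetic; the function name and the example rates are hypothetical, and the rates may be in any unit (for example product formed per minute) as long as both measurements use the same unit and conditions.

```python
def percent_inhibition(rate_with_inhibitor, rate_without_inhibitor):
    """Percent reduction in enzymatic rate caused by a candidate inhibitor.

    Returns 0 for no effect and 100 for complete inhibition.
    """
    if rate_without_inhibitor <= 0:
        raise ValueError("the uninhibited control rate must be positive")
    return 100.0 * (1.0 - rate_with_inhibitor / rate_without_inhibitor)


# Hypothetical rates: control reaction versus reaction with the candidate.
print(percent_inhibition(2.5, 10.0))  # → 75.0
```

A screening campaign would typically set a cutoff on this value to decide which candidates proceed to further testing; any such cutoff is a choice of the experimenter and is not specified above.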

[0118] VII. In Vivo Inhibitor Assay

[0119] In one embodiment, a suspected herbicide, for example identified by in vitro screening, is applied to plants at various concentrations. The suspected herbicide is preferably sprayed on the plants. After application of the suspected herbicide, its effect on the plants, for example death or suppression of growth, is recorded.

[0120] In another embodiment, an in vivo screening assay for inhibitors of the GT1802, GT1209, GT1354, or GT0946 activity uses transgenic plants, plant tissue, plant seeds or plant cells capable of overexpressing a nucleotide sequence having GT1802, GT1209, GT1354, or GT0946 activity, respectively, wherein the GT1802, GT1209, GT1354, or GT0946 gene product is enzymatically active in the transgenic plants, plant tissue, plant seeds or plant cells. The nucleotide sequence is derived from a eukaryote, such as a yeast, but is preferably derived from a plant. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, or encodes an enzyme having GT1802 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2. In another preferred embodiment, the nucleotide sequence is derived from a prokaryote. In a further embodiment, the nucleotide sequence is derived from a eukaryote, such as a yeast, but is preferably derived from a plant. In a preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:3, or encodes an enzyme having GT1209 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:4. In another preferred embodiment, the nucleotide sequence is derived from a prokaryote. In a further preferred embodiment, the nucleotide sequence is derived from a eukaryote, such as a yeast, but is preferably derived from a plant. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:5, or encodes an enzyme having GT1354 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:6. In another preferred embodiment, the nucleotide sequence is derived from a prokaryote. In a further preferred embodiment, the nucleotide sequence is derived from a eukaryote, such as a yeast, but is preferably derived from a plant. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:7, or encodes an enzyme having GT0946 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:8. In another preferred embodiment, the nucleotide sequence is derived from a prokaryote.

[0121] A chemical is then applied to the transgenic plants, plant tissue, plant seeds or plant cells and to the isogenic non-transgenic plants, plant tissue, plant seeds or plant cells, and the growth or viability of the transgenic and non-transgenic plants, plant tissue, plant seeds or plant cells is determined after application of the chemical and compared. Compounds capable of inhibiting the growth of the non-transgenic plants, but not affecting the growth of the transgenic plants, are selected as specific inhibitors of GT1802, GT1209, GT1354, or GT0946 activity.
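The selection rule of the preceding paragraph can be illustrated with a minimal sketch. The record type, the function name, the compound labels, and both growth thresholds are hypothetical and are not taken from the specification; growth is assumed to be expressed relative to an untreated control (1.0 = no inhibition).

```python
from dataclasses import dataclass


@dataclass
class GrowthResult:
    compound: str
    non_transgenic_growth: float  # growth relative to untreated control (assumed scale 0..1)
    transgenic_growth: float      # same measure for the overexpressing line


def select_specific_inhibitors(results, inhibited_below=0.5, unaffected_above=0.9):
    """Return compounds that inhibit the non-transgenic line but spare the transgenic line.

    Both thresholds are illustrative assumptions, not values from the specification.
    """
    return [
        r.compound
        for r in results
        if r.non_transgenic_growth < inhibited_below
        and r.transgenic_growth >= unaffected_above
    ]


# Hypothetical screening data
results = [
    GrowthResult("cmpd-A", 0.10, 0.95),  # inhibits only the non-transgenic line
    GrowthResult("cmpd-B", 0.15, 0.20),  # inhibits both lines, so not target-specific
    GrowthResult("cmpd-C", 0.98, 0.97),  # inactive
]
print(select_specific_inhibitors(results))
```

A compound such as the hypothetical cmpd-B, which inhibits both lines, is rejected because overexpression of the target does not rescue growth.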

[0122] VIII. Herbicide Tolerant Plants

[0123] The present invention is further directed to plants, plant tissue, plant seeds, and plant cells tolerant to herbicides that inhibit the naturally occurring GT1802, GT1209, GT1354, or GT0946 activity in these plants, wherein the tolerance is conferred by an altered GT1802, GT1209, GT1354, or GT0946 activity, respectively. Altered GT1802, GT1209, GT1354, or GT0946 activity may be conferred upon a plant according to the invention by increasing expression of wild-type herbicide-sensitive GT1802, GT1209, GT1354, or GT0946 gene, respectively, for example by providing additional wild-type GT1802, GT1209, GT1354, or GT0946 genes and/or by overexpressing the endogenous GT1802, GT1209, GT1354, or GT0946 gene, for example by driving expression with a strong promoter. Altered GT1802, GT1209, GT1354, or GT0946 activity also may be accomplished by expressing nucleotide sequences that are substantially similar to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, or homologs in a plant. Still further altered GT1802, GT1209, GT1354, or GT0946 activity is conferred on a plant by expressing modified herbicide-tolerant GT1802, GT1209, GT1354, or GT0946 genes, respectively, in the plant. Combinations of these techniques may also be used. Representative plants include any plants to which these herbicides are applied for their normally intended purpose. Preferred are agronomically important crops such as cotton, soybean, oilseed rape, sugar beet, maize, rice, wheat, barley, oats, rye, sorghum, millet, turf, forage, turf grasses, and the like.

[0124] A. Increased Expression of Wild-Type GT1802, GT1209, GT1354, or GT0946

[0125] Achieving altered GT1802, GT1209, GT1354, or GT0946 activity through increased expression results in a level of GT1802, GT1209, GT1354, or GT0946 activity, respectively, in the plant cell at least sufficient to overcome growth inhibition caused by the herbicide when applied in amounts sufficient to inhibit normal growth of control plants. The level of expressed enzyme generally is at least two times, preferably at least five times, and more preferably at least ten times the natively expressed amount. Increased expression may be due to multiple copies of a wild-type GT1802, GT1209, GT1354, or GT0946 gene; multiple occurrences of the coding sequence within the gene (i.e. gene amplification) or a mutation in the non-coding, regulatory sequence of the endogenous gene in the plant cell. Plants having such altered gene activity can be obtained by direct selection in plants by methods known in the art (see, e.g. U.S. Pat. Nos. 5,162,602, and 4,761,373, and references cited therein). These plants also may be obtained by genetic engineering techniques known in the art. Increased expression of a herbicide-sensitive GT1802, GT1209, GT1354, or GT0946 gene can also be accomplished by transforming a plant cell with a recombinant or chimeric DNA molecule comprising a promoter capable of driving expression of an associated structural gene in a plant cell operatively linked to a homologous or heterologous structural gene encoding the GT1802, GT1209, GT1354, or GT0946 protein, respectively, or a homolog thereof. Preferably, the transformation is stable, thereby providing a heritable transgenic trait.

[0126] B. Expression of Modified Herbicide-Tolerant GT1802, GT1209, GT1354, or GT0946 Proteins

[0127] According to this embodiment, plants, plant tissue, plant seeds, or plant cells are stably transformed with a recombinant DNA molecule comprising a suitable promoter functional in plants operatively linked to a coding sequence encoding a herbicide tolerant form of the GT1802, GT1209, GT1354, or GT0946 protein. A herbicide tolerant form of the enzyme has at least one amino acid substitution, addition or deletion that confers tolerance to a herbicide that inhibits the unmodified, naturally occurring form of the enzyme. The transgenic plants, plant tissue, plant seeds, or plant cells thus created are then selected by conventional selection techniques, whereby herbicide tolerant lines are isolated, characterized, and developed. Below are described methods for obtaining genes that encode herbicide tolerant forms of GT1802, GT1209, GT1354, or GT0946 protein.

[0128] One general strategy involves direct or indirect mutagenesis procedures on microbes. For instance, a genetically manipulatable microbe such as E. coli or S. cerevisiae may be subjected to random mutagenesis in vivo with mutagens such as UV light or ethyl or methyl methane sulfonate. Mutagenesis procedures are described, for example, in Miller, Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1972); Davis et al., Advanced Bacterial Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1980); Sherman et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1983); and U.S. Pat. No. 4,975,374. For example, the microbe selected for mutagenesis contains a normal, inhibitor-sensitive GT1802, GT1209, GT1354, or GT0946 gene, or nucleotide sequence substantially similar thereto, which encodes a protein having GT1802, GT1209, GT1354, or GT0946 gene product activity, and is dependent upon the activity conferred by this gene for growth. The mutagenized cells are grown in the presence of the inhibitor at concentrations that inhibit the unmodified gene. Colonies of the mutagenized microbe that grow better than the unmutagenized microbe in the presence of the inhibitor (i.e. exhibit resistance to the inhibitor) are selected for further analysis. GT1802, GT1209, GT1354, or GT0946 genes conferring tolerance to the inhibitor are isolated from these colonies, either by cloning or by PCR amplification, and their sequences are elucidated. Sequences encoding altered gene products are then cloned back into the microbe to confirm their ability to confer inhibitor tolerance.

[0129] A method of obtaining mutant herbicide-tolerant alleles of a plant GT1802, GT1209, GT1354, or GT0946 gene involves direct selection in plants. For example, the effect of a mutagenized GT1802, GT1209, GT1354, or GT0946 gene on the growth inhibition of plants such as Arabidopsis, soybean, or maize is determined by plating seeds, sterilized by art-recognized methods, on plates containing a simple minimal salts medium with increasing concentrations of the inhibitor. Such concentrations are in the range of 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100, 300, 1000 and 3000 parts per million (ppm). The lowest dose at which significant growth inhibition can be reproducibly detected is used for subsequent experiments. Determination of the lowest dose is routine in the art.
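By way of illustration only, the choice of the lowest reproducibly inhibitory dose might be sketched as follows. The function name, the data layout, and the 50% growth threshold used to stand in for "significant growth inhibition" are assumptions; the specification does not define a numerical criterion.

```python
def lowest_effective_dose(replicates, max_relative_growth=0.5):
    """Return the lowest dose (ppm) at which every replicate shows inhibition.

    replicates: dict mapping dose in ppm -> list of growth values relative to
    an untreated control (1.0 = no inhibition). The threshold is an assumed
    stand-in for "significant growth inhibition".
    """
    for dose in sorted(replicates):
        values = replicates[dose]
        # Require at least one replicate, and inhibition in all of them
        if values and all(v <= max_relative_growth for v in values):
            return dose
    return None


# Hypothetical plate data for a few doses taken from the series above
plates = {
    0.1: [0.98, 1.00, 0.95],
    0.3: [0.90, 0.45, 0.92],  # inhibition seen once only, not reproducible
    1: [0.40, 0.35, 0.45],
    3: [0.10, 0.12, 0.08],
}
print(lowest_effective_dose(plates))
```

Requiring inhibition in every replicate is one simple reading of "reproducibly detected"; a statistical test across replicates could be substituted.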

[0130] Mutagenesis of plant material is utilized to increase the frequency at which resistant alleles occur in the selected population. Mutagenized seed material is derived from a variety of sources, including chemical or physical mutagenesis of seeds, or chemical or physical mutagenesis of pollen (Neuffer, In Maize for Biological Research, Sheridan, ed., Univ. Press, Grand Forks, N.Dak., pp. 61-64 (1982)), which is then used to fertilize plants and the resulting M₁ mutant seeds collected. Typically for Arabidopsis, M₂ seeds (Lehle Seeds, Tucson, Ariz.), which are progeny seeds of plants grown from seeds mutagenized with chemicals, such as ethyl methane sulfonate, or with physical agents, such as gamma rays or fast neutrons, are plated at densities of up to 10,000 seeds/plate (10 cm diameter) on minimal salts medium containing an appropriate concentration of inhibitor to select for tolerance. Seedlings that continue to grow and remain green 7-21 days after plating are transplanted to soil and grown to maturity and seed set. Progeny of these seeds are tested for tolerance to the herbicide. If the tolerance trait is dominant, plants whose seed segregate 3:1 (resistant:sensitive) are presumed to have been heterozygous for the resistance at the M₂ generation. Plants that give rise to all resistant seed are presumed to have been homozygous for the resistance at the M₂ generation. Such mutagenesis on intact seeds and screening of their M₂ progeny seed can also be carried out on other species, for instance soybean (see, e.g. U.S. Pat. No. 5,084,082). Alternatively, mutant seeds to be screened for herbicide tolerance are obtained as a result of fertilization with pollen mutagenized by chemical or physical means.
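The segregation reasoning above can be checked with a standard chi-square goodness-of-fit test against a 3:1 ratio. The following is a minimal sketch; the function names and the seed counts are hypothetical, and the test itself is a conventional genetics tool rather than a procedure recited in this specification. The critical value 3.841 corresponds to one degree of freedom at the 0.05 significance level.

```python
CRITICAL_VALUE_DF1_P05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 resistant:sensitive ratio."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no progeny scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def presumed_m2_genotype(resistant, sensitive):
    """Classify the M2 parent from its progeny counts, assuming a dominant trait."""
    if resistant > 0 and sensitive == 0:
        return "homozygous"
    if chi_square_3_to_1(resistant, sensitive) < CRITICAL_VALUE_DF1_P05:
        return "heterozygous"
    return "inconsistent with 3:1"


# Hypothetical progeny counts
print(presumed_m2_genotype(152, 48))
print(presumed_m2_genotype(200, 0))
print(presumed_m2_genotype(100, 100))
```

With small progeny samples an all-resistant result can arise by chance from a heterozygous parent, so the sample size should be chosen accordingly.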

[0131] Confirmation that the genetic basis of the herbicide tolerance is a GT1802, GT1209, GT1354, or GT0946 gene is ascertained as exemplified below. First, alleles of the GT1802, GT1209, GT1354, or GT0946 gene from plants exhibiting resistance to the inhibitor are isolated using PCR with primers based either upon the Arabidopsis cDNA coding sequences shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7, respectively, or, more preferably, based upon the unaltered GT1802, GT1209, GT1354, or GT0946 gene sequence from the plant used to generate tolerant alleles. After sequencing the alleles to determine the presence of mutations in the coding sequence, the alleles are tested for their ability to confer tolerance to the inhibitor on plants into which the putative tolerance-conferring alleles have been transformed. These plants can be either Arabidopsis plants or any other plant whose growth is susceptible to the GT1802, GT1209, GT1354, or GT0946 inhibitors. Second, the inserted GT1802, GT1209, GT1354, or GT0946 genes are mapped relative to known restriction fragment length polymorphisms (RFLPs) (see, for example, Chang et al., Proc. Natl. Acad. Sci. USA 85: 6856-6860 (1988); Nam et al., Plant Cell 1: 699-705 (1989)), cleaved amplified polymorphic sequences (CAPS) (Konieczny and Ausubel (1993) The Plant Journal, 4(2): 403-410), or SSLPs (Bell and Ecker (1994) Genomics, 19: 137-144). The GT1802, GT1209, GT1354, or GT0946 inhibitor tolerance trait is independently mapped using the same markers. When tolerance is due to a mutation in that GT1802, GT1209, GT1354, or GT0946 gene, the tolerance trait maps to a position indistinguishable from the position of the GT1802, GT1209, GT1354, or GT0946 gene.

[0132] Another method of obtaining herbicide-tolerant alleles of a GT1802, GT1209, GT1354, or GT0946 gene is by selection in plant cell cultures. Explants of plant tissue, e.g. embryos, leaf disks, etc. or actively growing callus or suspension cultures of a plant of interest are grown on medium in the presence of increasing concentrations of the inhibitory herbicide or an analogous inhibitor suitable for use in a laboratory environment. Varying degrees of growth are recorded in different cultures. In certain cultures, fast-growing variant colonies arise that continue to grow even in the presence of normally inhibitory concentrations of inhibitor. The frequency with which such faster-growing variants occur can be increased by treatment with a chemical or physical mutagen before exposing the tissues or cells to the inhibitor. Putative tolerance-conferring alleles of the GT1802, GT1209, GT1354, or GT0946 gene are isolated and tested as described in the foregoing paragraphs. Those alleles identified as conferring herbicide tolerance may then be engineered for optimal expression and transformed into the plant. Alternatively, plants can be regenerated from the tissue or cell cultures containing these alleles.

[0133] Still another method involves mutagenesis of wild-type, herbicide sensitive plant GT1802, GT1209, GT1354, or GT0946 genes in genetically manipulatable microbes, followed by culturing the microbe on medium that contains inhibitory concentrations (i.e. sufficient to cause abnormal growth, inhibit growth or cause cell death) of the inhibitor, and then selecting those colonies that grow normally in the presence of the inhibitor. More specifically, a plant cDNA, such as the Arabidopsis cDNA encoding the GT1802, GT1209, GT1354, or GT0946 protein, is cloned into a microbe that is dependent on GT1802, GT1209, GT1354, or GT0946 gene product activity, respectively, for growth, or that otherwise lacks the GT1802, GT1209, GT1354, or GT0946 activity. The transformed microbe is then subjected to in vivo mutagenesis or to in vitro mutagenesis by any of several chemical or enzymatic methods known in the art, e.g. sodium bisulfite (Shortle et al., Methods Enzymol. 100:457-468 (1983)); methoxylamine (Kadonaga et al., Nucleic Acids Res. 13:1733-1745 (1985)); oligonucleotide-directed saturation mutagenesis (Hutchinson et al., Proc. Natl. Acad. Sci. USA, 83:710-714 (1986)); or various polymerase misincorporation strategies (see, e.g. Shortle et al., Proc. Natl. Acad. Sci. USA, 79:1588-1592 (1982); Shiraishi et al., Gene 64:313-319 (1988); and Leung et al., Technique 1:11-15 (1989)). Colonies that grow normally in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and tested for the ability to confer tolerance to the inhibitor by retransforming them into the microbe lacking GT1802, GT1209, GT1354, or GT0946 activity, respectively. The DNA sequences of cDNA inserts from plasmids that pass this test are then determined.

[0134] Herbicide resistant GT1802, GT1209, GT1354, or GT0946 proteins are also obtained using methods involving in vitro recombination, also called DNA shuffling. By DNA shuffling, mutations, preferably random mutations, are introduced into nucleotide sequences encoding GT1802, GT1209, GT1354, or GT0946 activity. DNA shuffling also leads to the recombination and rearrangement of sequences within a GT1802, GT1209, GT1354, or GT0946 gene or to recombination and exchange of sequences between two or more different GT1802, GT1209, GT1354, or GT0946 genes. These methods allow for the production of millions of mutated GT1802, GT1209, GT1354, or GT0946 coding sequences. The mutated genes, or shuffled genes, are screened for desirable properties, e.g. improved tolerance to herbicides, and for mutations that provide broad spectrum tolerance to the different classes of inhibitor chemistry. Such screens are well within the skills of a routineer in the art.

[0135] In a preferred embodiment, a mutagenized GT1802, GT1209, GT1354, or GT0946 gene is formed from at least one template GT1802, GT1209, GT1354, or GT0946 gene, wherein the template GT1802, GT1209, GT1354, or GT0946 gene has been cleaved into double-stranded random fragments of a desired size, and comprising the steps of adding to the resultant population of double-stranded random fragments one or more single or double-stranded oligonucleotides, wherein said oligonucleotides comprise an area of identity and an area of heterology to the double-stranded random fragments; denaturing the resultant mixture of double-stranded random fragments and oligonucleotides into single-stranded fragments; incubating the resultant population of single-stranded fragments with a polymerase under conditions which result in the annealing of said single-stranded fragments at said areas of identity to form pairs of annealed fragments, said areas of identity being sufficient for one member of a pair to prime replication of the other, thereby forming a mutagenized double-stranded polynucleotide; and repeating the second and third steps for at least two further cycles, wherein the resultant mixture in the second step of a further cycle includes the mutagenized double-stranded polynucleotide from the third step of the previous cycle, and the further cycle forms a further mutagenized double-stranded polynucleotide, wherein the mutagenized polynucleotide is a mutated GT1802, GT1209, GT1354, or GT0946 gene having enhanced tolerance to a herbicide which inhibits naturally occurring GT1802, GT1209, GT1354, or GT0946 activity, respectively. In a preferred embodiment, the concentration of a single species of double-stranded random fragment in the population of double-stranded random fragments is less than 1% by weight of the total DNA. In a further preferred embodiment, the template double-stranded polynucleotide comprises at least about 100 species of polynucleotides. 
In another preferred embodiment, the size of the double-stranded random fragments is from about 5 bp to 5 kb. In a further preferred embodiment, the fourth step of the method comprises repeating the second and the third steps for at least 10 cycles. Such method is described e.g. in Stemmer et al. (1994) Nature 370: 389-391, in U.S. Pat. Nos. 5,605,793, 5,811,238 and in Crameri et al. (1998) Nature 391: 288-291, as well as in WO 97/20078, and these references are incorporated herein by reference.

[0136] In another preferred embodiment, any combination of two or more different GT1802, GT1209, GT1354, or GT0946 genes are mutagenized in vitro by a staggered extension process (StEP), as described e.g. in Zhao et al. (1998) Nature Biotechnology 16: 258-261. The two or more GT1802, GT1209, GT1354, or GT0946 genes are used as template for PCR amplification with the extension cycles of the PCR reaction preferably carried out at a lower temperature than the optimal polymerization temperature of the polymerase. For example, when a thermostable polymerase with an optimal temperature of approximately 72° C. is used, the temperature for the extension reaction is desirably below 72° C., more desirably below 65° C., preferably below 60° C., more preferably the temperature for the extension reaction is 55° C. Additionally, the duration of the extension reaction of the PCR cycles is desirably shorter than usually carried out in the art, more desirably it is less than 30 seconds, preferably it is less than 15 seconds, more preferably the duration of the extension reaction is 5 seconds. Only a short DNA fragment is polymerized in each extension reaction, allowing template switch of the extension products between the starting DNA molecules after each cycle of denaturation and annealing, thereby generating diversity among the extension products. The optimal number of cycles in the PCR reaction depends on the length of the GT1802, GT1209, GT1354, or GT0946 genes to be mutagenized, but desirably over 40 cycles, more desirably over 60 cycles, preferably over 80 cycles are used. Optimal extension conditions and the optimal number of PCR cycles for every combination of GT1802, GT1209, GT1354, or GT0946 genes are determined using procedures well known in the art. The other parameters for the PCR reaction are essentially the same as commonly used in the art.
The primers for the amplification reaction are preferably designed to anneal to DNA sequences located outside of the GT1802, GT1209, GT1354, or GT0946 genes, e.g. to DNA sequences of a vector comprising the GT1802, GT1209, GT1354, or GT0946 genes, whereby the different GT1802, GT1209, GT1354, or GT0946 genes used in the PCR reaction are preferably comprised in separate vectors. The primers desirably anneal to sequences located less than 500 bp away from GT1802, GT1209, GT1354, or GT0946 sequences, preferably less than 200 bp away from the GT1802, GT1209, GT1354, or GT0946 sequences, more preferably less than 120 bp away from the GT1802, GT1209, GT1354, or GT0946 sequences. Preferably, the GT1802, GT1209, GT1354, or GT0946 sequences are surrounded by restriction sites, which are included in the DNA sequence amplified during the PCR reaction, thereby facilitating the cloning of the amplified products into a suitable vector. In another preferred embodiment, fragments of GT1802, GT1209, GT1354, or GT0946 genes having cohesive ends are produced as described in WO 98/05765. The cohesive ends are produced by ligating a first oligonucleotide corresponding to a part of a GT1802, GT1209, GT1354, or GT0946 gene to a second oligonucleotide not present in the gene or corresponding to a part of the gene not adjoining to the part of the gene corresponding to the first oligonucleotide, wherein the second oligonucleotide contains at least one ribonucleotide. A double-stranded DNA is produced using the first oligonucleotide as template and the second oligonucleotide as primer. The ribonucleotide is cleaved and removed. The nucleotide(s) located 5′ to the ribonucleotide is also removed, resulting in double-stranded fragments having cohesive ends. Such fragments are randomly reassembled by ligation to obtain novel combinations of gene sequences.
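The template-switching principle of the staggered extension process can be pictured with a deliberately simplified toy model. It assumes pre-aligned parental sequences of equal length, a fixed number of bases added per abbreviated extension step, and a uniformly random choice of template at each step; none of these simplifications comes from the specification or from the cited reference, and the function name and parameters are hypothetical.

```python
import random


def step_recombine(templates, bases_per_cycle, rng):
    """Toy model of StEP: grow one strand in short steps, re-annealing to a
    randomly chosen parental template before every step.

    Returns the chimeric product and the number of extension cycles used.
    """
    if bases_per_cycle <= 0:
        raise ValueError("bases_per_cycle must be positive")
    length = len(templates[0])
    if any(len(t) != length for t in templates):
        raise ValueError("this toy model assumes aligned templates of equal length")
    pieces = []
    position = 0
    cycles = 0
    while position < length:
        template = rng.choice(templates)  # template switch after denaturation and annealing
        pieces.append(template[position:position + bases_per_cycle])  # short extension
        position += bases_per_cycle
        cycles += 1
    return "".join(pieces), cycles


# Two artificial parents that differ at every position, so switches are visible
parent_a = "A" * 60
parent_c = "C" * 60
product, cycles = step_recombine([parent_a, parent_c], bases_per_cycle=5, rng=random.Random(0))
print(cycles, product)
```

In this model the number of cycles needed grows with template length and shrinks with the extension per cycle, which is consistent with the statement that the optimal cycle number depends on the length of the genes to be mutagenized.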

[0137] Any GT1802, GT1209, GT1354, or GT0946 gene or any combination of GT1802, GT1209, GT1354, or GT0946 genes, or homologs thereof, is used for in vitro recombination in the context of the present invention, for example, a GT1802, GT1209, GT1354, or GT0946 gene derived from a plant, such as, e.g. Arabidopsis thaliana, e.g. a GT1802 gene set forth in SEQ ID NO:1, a GT1209 gene set forth in SEQ ID NO:3, a GT1354 gene set forth in SEQ ID NO:5, and a GT0946 gene set forth in SEQ ID NO:7. Whole GT1802, GT1209, GT1354, or GT0946 genes or portions thereof are used in the context of the present invention. The library of mutated GT1802, GT1209, GT1354, or GT0946 genes obtained by the methods described above is cloned into appropriate expression vectors and the resulting vectors are transformed into an appropriate host, for example a plant cell, an alga such as Chlamydomonas, a yeast or a bacterium. An appropriate host requires GT1802, GT1209, GT1354, or GT0946 gene product activity for growth. Host cells transformed with the vectors comprising the library of mutated GT1802, GT1209, GT1354, or GT0946 genes are cultured on medium that contains inhibitory concentrations of the inhibitor and those colonies that grow in the presence of the inhibitor are selected. Colonies that grow in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and the DNA sequences of cDNA inserts from plasmids that pass this test are then determined.

[0138] An assay for identifying a modified GT1802, GT1209, GT1354, or GT0946 gene that is tolerant to an inhibitor may be performed in the same manner as the assay to identify inhibitors of the GT1802, GT1209, GT1354, or GT0946 activity (Inhibitor Assay, above) with the following modifications: First, a mutant GT1802, GT1209, GT1354, or GT0946 protein is substituted in one of the reaction mixtures for the wild-type GT1802, GT1209, GT1354, or GT0946 protein of the inhibitor assay. Second, an inhibitor of wild-type enzyme is present in both reaction mixtures. Third, mutated activity (activity in the presence of inhibitor and mutated enzyme) and unmutated activity (activity in the presence of inhibitor and wild-type enzyme) are compared to determine whether a significant increase in enzymatic activity is observed in the mutated activity when compared to the unmutated activity. Mutated activity is any measure of activity of the mutated enzyme while in the presence of a suitable substrate and the inhibitor. Unmutated activity is any measure of activity of the wild-type enzyme while in the presence of a suitable substrate and the inhibitor.
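The comparison in the third step can be sketched as a simple fold-change check on replicate measurements. The function name, the activity units, and the two-fold cut-off standing in for a "significant increase" are assumptions made for illustration; the specification does not fix a numerical criterion, and a formal statistical test could be used instead.

```python
from statistics import mean


def shows_tolerance(mutated_activity, unmutated_activity, min_fold=2.0):
    """Compare activity of mutant and wild-type enzyme, both measured with inhibitor present.

    mutated_activity / unmutated_activity: replicate activity readings in the
    same (arbitrary) units. min_fold is an assumed cut-off.
    """
    mutant = mean(mutated_activity)
    wild_type = mean(unmutated_activity)
    if wild_type == 0:
        # Wild-type enzyme fully inhibited: any residual mutant activity counts
        return mutant > 0
    return mutant / wild_type >= min_fold


# Hypothetical replicate readings, inhibitor present in both reaction mixtures
print(shows_tolerance([8.1, 7.9, 8.0], [1.0, 1.2, 0.8]))
print(shows_tolerance([1.1, 0.9], [1.0, 1.0]))
```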

[0139] In addition to being used to create herbicide-tolerant plants, genes encoding herbicide-tolerant GT1802, GT1209, GT1354, or GT0946 protein can also be used as selectable markers in plant cell transformation methods. For example, plants, plant tissue, plant seeds, or plant cells transformed with a heterologous DNA sequence can also be transformed with a sequence encoding an altered GT1802, GT1209, GT1354, or GT0946 activity capable of being expressed by the plant. The transformed cells are transferred to medium containing an inhibitor of the enzyme in an amount sufficient to inhibit the growth or survivability of plant cells not expressing the modified coding sequence, wherein only the transformed cells will grow. The method is applicable to any plant cell capable of being transformed with a modified GT1802, GT1209, GT1354, or GT0946 gene, and can be used with any heterologous DNA sequence of interest. Expression of the heterologous DNA sequence and the modified gene can be driven by the same promoter functional in plant cells, or by separate promoters.

[0140] IX. Plant Transformation Technology

[0141] A wild type or herbicide-tolerant form of the GT1802, GT1209, GT1354, or GT0946 gene, or homologs thereof, can be incorporated in plant or bacterial cells using conventional recombinant DNA technology. Generally, this involves inserting a DNA molecule encoding the GT1802, GT1209, GT1354, or GT0946 gene into an expression system to which the DNA molecule is heterologous (i.e., not normally present) using standard cloning procedures known in the art. The vector contains the necessary elements for the transcription and translation of the inserted protein-coding sequences in a host cell containing the vector. A large number of vector systems known in the art can be used, such as plasmids, bacteriophage viruses and other modified viruses. The components of the expression system may also be modified to increase expression. For example, truncated sequences, nucleotide substitutions, nucleotide optimization or other modifications may be employed. Expression systems known in the art can be used to transform virtually any crop plant cell under suitable conditions. A heterologous DNA sequence comprising a wild-type or herbicide-tolerant form of the GT1802, GT1209, GT1354, or GT0946 gene is preferably stably transformed and integrated into the genome of the host cells. In another preferred embodiment, the heterologous DNA sequence comprising a wild-type or herbicide-tolerant form of the GT1802, GT1209, GT1354, or GT0946 gene is located on a self-replicating vector. Examples of self-replicating vectors are viruses, in particular gemini viruses. Transformed cells can be regenerated into whole plants such that the chosen form of the GT1802, GT1209, GT1354, or GT0946 gene confers herbicide tolerance in the transgenic plants.

[0142] A. Requirements for Construction of Plant Expression Cassettes

[0143] Gene sequences intended for expression in transgenic plants are first assembled in expression cassettes behind a suitable promoter expressible in plants. The expression cassettes may also comprise any further sequences required or selected for the expression of the heterologous DNA sequence. Such sequences include, but are not restricted to, transcription terminators, extraneous sequences to enhance expression such as introns, viral sequences, and sequences intended for the targeting of the gene product to specific organelles and cell compartments. These expression cassettes can then be easily transferred to the plant transformation vectors described infra. The following is a description of various components of typical expression cassettes.

[0144] 1. Promoters

[0145] The selection of the promoter used in expression cassettes will determine the spatial and temporal expression pattern of the heterologous DNA sequence in the plant transformed with this DNA sequence. Selected promoters will express heterologous DNA sequences in specific cell types (such as leaf epidermal cells, mesophyll cells, root cortex cells) or in specific tissues or organs (roots, leaves or flowers, for example) and the selection will reflect the desired location of accumulation of the gene product. Alternatively, the selected promoter may drive expression of the gene under various inducing conditions. Promoters vary in their strength, i.e., ability to promote transcription. Depending upon the host cell system utilized, any one of a number of suitable promoters known in the art can be used. For example, for constitutive expression, the CaMV 35S promoter, the rice actin promoter, or the ubiquitin promoter may be used. For regulatable expression, the chemically inducible PR-1 promoter from tobacco or Arabidopsis may be used (see, e.g., U.S. Pat. No. 5,689,044).

[0146] 2. Transcriptional Terminators

[0147] A variety of transcriptional terminators are available for use in expression cassettes. These are responsible for the termination of transcription beyond the heterologous DNA sequence and its correct polyadenylation. Appropriate transcriptional terminators are those that are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator and the pea rbcS E9 terminator. These can be used in both monocotyledonous and dicotyledonous plants.

[0148] 3. Sequences for the Enhancement or Regulation of Expression

[0149] Numerous sequences have been found to enhance gene expression from within the transcriptional unit and these sequences can be used in conjunction with the genes of this invention to increase their expression in transgenic plants. For example, various intron sequences such as introns of the maize AdhI gene have been shown to enhance expression, particularly in monocotyledonous cells. In addition, a number of non-translated leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells.

[0150] 4. Coding Sequence Optimization

[0151] The coding sequence of the selected gene optionally is genetically engineered by altering the coding sequence for optimal expression in the crop species of interest. Methods for modifying coding sequences to achieve optimal expression in a particular crop species are well known (see, e.g. Perlak et al., Proc. Natl. Acad. Sci. USA 88: 3324 (1991); Koziel et al., Bio/technol. 11: 194 (1993); and Fennoy and Bailey-Serres, Nucl. Acids Res. 21: 5294-5300 (1993)). Methods for modifying coding sequences by taking into account codon usage in plant genes and in higher plants, green algae, and cyanobacteria are well known (see table 4 in Murray et al., Nucl. Acids Res. 17: 477-498 (1989); Campbell and Gowri, Plant Physiol. 92: 1-11 (1990)).
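As a minimal sketch of the codon-level substitution underlying such methods, the following replaces codons with a caller-supplied preferred synonymous codon while leaving the encoded protein unchanged. The function names and the two preferred codons in the example are purely illustrative and are not taken from the cited references; a real optimization would draw preferences from a codon usage table for the crop of interest and would typically weigh further factors such as GC content.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds):
    """Translate a coding sequence (length divisible by 3) into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize_codons(cds, preferred):
    """Swap each codon for the preferred synonymous codon, where one is given.

    preferred: dict mapping one-letter amino acid -> preferred codon (assumed input).
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for amino_acid, codon in preferred.items():
        if CODON_TABLE[codon] != amino_acid:
            raise ValueError(f"{codon} does not encode {amino_acid}")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    optimized = "".join(out)
    # The protein sequence must be unchanged by construction
    assert translate(optimized) == translate(cds)
    return optimized


# Illustrative preferences only; not actual crop codon usage data
example_preferences = {"L": "CTC", "K": "AAG"}
print(optimize_codons("ATGTTAAAATAA", example_preferences))
```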

[0152] 5. Targeting of the Gene Product Within the Cell

[0153] Various mechanisms for targeting gene products are known to exist in plants and the sequences controlling the functioning of these mechanisms have been characterized in some detail. For example, the targeting of gene products to the chloroplast is controlled by a signal sequence found at the amino terminal end of various proteins which is cleaved during chloroplast import to yield the mature protein (e.g. Comai et al. J. Biol. Chem. 263: 15104-15109 (1988)). Other gene products are localized to other organelles such as the mitochondrion and the peroxisome (e.g. Unger et al. Plant Molec. Biol. 13: 411-418 (1989)). The cDNAs encoding these products can also be manipulated to effect the targeting of heterologous products encoded by DNA sequences to these organelles. In addition, sequences have been characterized which cause the targeting of products encoded by DNA sequences to other cell compartments. Amino terminal sequences are responsible for targeting to the ER, the apoplast, and extracellular secretion from aleurone cells (Koehler & Ho, Plant Cell 2: 769-783 (1990)). Additionally, amino terminal sequences in conjunction with carboxy terminal sequences are responsible for vacuolar targeting of gene products (Shinshi et al. Plant Molec. Biol. 14: 357-368 (1990)). By the fusion of the appropriate targeting sequences described above to heterologous DNA sequences of interest it is possible to direct this product to any organelle or cell compartment.

[0154] B. Construction of Plant Transformation Vectors

[0155] Numerous transformation vectors available for plant transformation are known to those of ordinary skill in the plant transformation arts, and the genes pertinent to this invention can be used in conjunction with any such vectors. The selection of vector will depend upon the preferred transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers may be preferred. Selection markers used routinely in transformation include the nptII gene, which confers resistance to kanamycin and related antibiotics (Messing & Vierra. Gene 19: 259-268 (1982); Bevan et al., Nature 304:184-187 (1983)), the bar gene, which confers resistance to the herbicide phosphinothricin (White et al., Nucl. Acids Res 18: 1062 (1990), Spencer et al. Theor. Appl. Genet 79: 625-631 (1990)), the hph gene, which confers resistance to the antibiotic hygromycin (Blochinger & Diggelmann, Mol Cell Biol 4: 2929-2931), the manA gene, which allows for positive selection in the presence of mannose (Miles and Guest (1984) Gene, 32:41-48; U.S. Pat. No. 5,767,378), the dhfr gene, which confers resistance to methotrexate (Bourouis et al., EMBO J. 2(7): 1099-1104 (1983)), and the EPSPS gene, which confers resistance to glyphosate (U.S. Pat. Nos. 4,940,935 and 5,188,642).

[0156] 1. Vectors Suitable for Agrobacterium Transformation

[0157] Many vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, Nucl. Acids Res. (1984)). Typical vectors suitable for Agrobacterium transformation include the binary vectors pCIB200 and pCIB2001, as well as the binary vector pCIB10 and hygromycin selection derivatives thereof. (See, for example, U.S. Pat. No. 5,639,949).

[0158] 2. Vectors Suitable for non-Agrobacterium Transformation

[0159] Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector and consequently vectors lacking these sequences can be utilized in addition to vectors such as the ones described above which contain T-DNA sequences. Transformation techniques that do not rely on Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g. PEG and electroporation) and microinjection. The choice of vector depends largely on the preferred selection for the species being transformed. Typical vectors suitable for non-Agrobacterium transformation include pCIB3064, pSOG19, and pSOG35. (See, for example, U.S. Pat. No. 5,639,949).

[0160] C. Transformation Techniques

[0161] Once the coding sequence of interest has been cloned into an expression system, it is transformed into a plant cell. Methods for transformation and regeneration of plants are well known in the art. For example, Ti plasmid vectors have been utilized for the delivery of foreign DNA, as well as direct DNA uptake, liposomes, electroporation, micro-injection, and microprojectiles. In addition, bacteria from the genus Agrobacterium can be utilized to transform plant cells.

[0162] Transformation techniques for dicotyledons are well known in the art and include Agrobacterium-based techniques and techniques that do not require Agrobacterium. Non-Agrobacterium techniques involve the uptake of exogenous genetic material directly by protoplasts or cells. This can be accomplished by PEG or electroporation mediated uptake, particle bombardment-mediated delivery, or microinjection. In each case the transformed cells are regenerated to whole plants using standard techniques known in the art.

[0163] Transformation of most monocotyledon species has now also become routine. Preferred techniques include direct gene transfer into protoplasts using PEG or electroporation techniques, particle bombardment into callus tissue, as well as Agrobacterium-mediated transformation.

[0164] D. Plastid Transformation

[0165] In another preferred embodiment, a nucleotide sequence encoding a polypeptide having GT1802, GT1209, GT1354, or GT0946 activity is directly transformed into the plastid genome. Plastid expression, in which genes are inserted by homologous recombination into the several thousand copies of the circular plastid genome present in each plant cell, takes advantage of the enormous copy number advantage over nuclear-expressed genes to permit expression levels that can readily exceed 10% of the total soluble plant protein. In a preferred embodiment, the nucleotide sequence is inserted into a plastid targeting vector and transformed into the plastid genome of a desired plant host. Plants homoplasmic for plastid genomes containing the nucleotide sequence are obtained, and are preferentially capable of high expression of the nucleotide sequence.

[0166] Plastid transformation technology is extensively described, for example, in U.S. Pat. Nos. 5,451,513, 5,545,817, 5,545,818, and 5,877,462, in PCT application nos. WO 95/16783 and WO 97/32977, and in McBride et al. (1994) Proc. Natl. Acad. Sci. USA 91, 7301-7305, all incorporated herein by reference in their entirety. The basic technique for plastid transformation involves introducing regions of cloned plastid DNA flanking a selectable marker together with the nucleotide sequence into a suitable target tissue, e.g., using biolistics or protoplast transformation (e.g., calcium chloride or PEG mediated transformation). The 1 to 1.5 kb flanking regions, termed targeting sequences, facilitate homologous recombination with the plastid genome and thus allow the replacement or modification of specific regions of the plastome. Initially, point mutations in the chloroplast 16S rRNA and rps12 genes conferring resistance to spectinomycin and/or streptomycin were utilized as selectable markers for transformation (Svab, Z., Hajdukiewicz, P., and Maliga, P. (1990) Proc. Natl. Acad. Sci. USA 87, 8526-8530; Staub, J. M., and Maliga, P. (1992) Plant Cell 4, 39-45). The presence of cloning sites between these markers allowed creation of a plastid targeting vector for introduction of foreign genes (Staub, J. M., and Maliga, P. (1993) EMBO J. 12, 601-606). Substantial increases in transformation frequency were obtained by replacement of the recessive rRNA or r-protein antibiotic resistance genes with a dominant selectable marker, the bacterial aadA gene encoding the spectinomycin-detoxifying enzyme aminoglycoside-3′-adenyltransferase (Svab, Z., and Maliga, P. (1993) Proc. Natl. Acad. Sci. USA 90, 913-917). Other selectable markers useful for plastid transformation are known in the art and encompassed within the scope of the invention.

[0167] X. Breeding

[0168] The wild-type or altered form of a GT1802, GT1209, GT1354, or GT0946 gene of the present invention can be utilized to confer herbicide tolerance to a wide variety of plant cells, including those of gymnosperms, monocots, and dicots. Although the gene can be inserted into any plant cell falling within these broad classes, it is particularly useful in crop plant cells, such as rice, wheat, barley, rye, corn, potato, carrot, sweet potato, sugar beet, bean, pea, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, eggplant, pepper, celery, squash, pumpkin, zucchini, cucumber, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado, papaya, mango, banana, soybean, tobacco, tomato, sorghum and sugarcane.

[0169] The high-level expression of a wild-type GT1802, GT1209, GT1354, or GT0946 gene and/or the expression of herbicide-tolerant forms of a GT1802, GT1209, GT1354, or GT0946 gene conferring herbicide tolerance in plants, in combination with other characteristics important for production and quality, can be incorporated into plant lines through breeding approaches and techniques known in the art.

[0170] Where a herbicide tolerant GT1802, GT1209, GT1354, or GT0946 gene allele is obtained by direct selection in a crop plant or plant cell culture from which a crop plant can be regenerated, it is moved into commercial varieties using traditional breeding techniques to develop a herbicide tolerant crop without the need for genetically engineering the allele and transforming it into the plant.

[0171] The invention will be further described by reference to the following detailed examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified.

EXAMPLES

[0172] Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, et al., Molecular Cloning, eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987), Reiter, et al., Methods in Arabidopsis Research, World Scientific Press (1992), and Schultz et al., Plant Molecular Biology Manual, Kluwer Academic Publishers (1998). These references describe the standard techniques used for all steps in tagging and cloning genes from Ac/Ds transposon mutagenized populations of Arabidopsis: plant infection and transformation; screening for the identification of seedling mutants; and cosegregation analysis. Ds transposon insertion lines produced as described in Sundareson et al. (1995) Genes and Dev., 9:1797-1810, were used in these experiments.

Example 1

[0173] Transposon Border Isolation

[0174] Arabidopsis genomic DNA is isolated from line GT1802, GT1209, GT1354, or GT0946 using the Nucleon PhytoPure™ Plant DNA Isolation Kit (Amersham International plc, Buckinghamshire, England). Fragments of genomic DNA flanking the borders of the transposon are isolated using the TAIL-PCR technique (Liu et al. (1995) The Plant Journal, 8:457-463; Liu and Whittier (1995), Genomics, 25: 674-681). Three sets of 12 TAIL-PCR reactions, referred to as the primary, secondary and tertiary reactions, are performed. In each reaction, one arbitrary degenerate primer and one transposon-specific primer are used. The arbitrary degenerate primer is chosen from among six primers, LWAD1, CA51, CA52, CA53, CA54, and CA55 (Table 1), which are used to prime the genomic DNA flanking the insertion. These degenerate primers are used in combination with two sets of three, nested, transposon-specific primers (Table 2). These primers are homologous to regions of the Ds elements which lie at the outermost ends of the transposons, DS5 at the 5′ end (primers 5A, 5B, and 5C) and DS3 at the 3′ end (primers 3A, 3B, and 3C). When the degenerate and nested primer pairs are used in a series of low and high-stringency PCR amplifications, as described in the TAIL-PCR protocol (Liu and Whittier (1995), Genomics, 25: 674-681), DNA fragments are produced which correspond to the genomic DNA that is directly adjacent to the transposon insertion. The nucleic acid sequence of the PCR products from the tertiary TAIL-PCR reactions are then determined by standard molecular biology techniques. The resulting sequences are analyzed for the presence of non-Ds transposon vector sequence. To confirm the integrity of the resultant products, PCR primers specific to the flanking genomic region are designed and used in conjunction with the tertiary nested primer in a PCR reaction, to confirm the transposon insertion point within the genomic DNA. 
Finding a PCR product of the appropriate size, based on the sequence of the TAIL-PCR clone, confirms a valid rescue.

TABLE 1 DEGENERATE PRIMERS

ID NO  PRIMER  DEGEN.  PRIMER SEQUENCE          NOTES AND REFERENCES
13     LWAD1   1026    NGT TGW GNA TWT SGW GNT  designed by L. Wegrich
14     CA51    128     TGW GNA GSA NCA SAG      derivative of primer AD1₍₂₎
15     CA52    128     AGW GNA GWA NCA WAG G    identical to primer AD2₍₂₎
16     CA53    256     STT GNT AST NCT NTG C    identical to primer AD5₍₃₎
17     CA54    64      NTC GAS TWT SGW GTT      identical to primer AD1₍₁₎
18     CA55    256     WGT GNA GWA NCA NAG A    identical to primer AD3₍₁₎
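The degeneracy column of Table 1 can be checked with a short Python sketch, on the assumption that degeneracy means the number of distinct sequences a primer represents, i.e. the product of the number of bases each IUPAC code stands for. The dictionary and function names are illustrative only. Computed this way from the sequences as printed, five primers reproduce the tabulated values, while LWAD1 gives 1024 rather than the tabulated 1026.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide code.
IUPAC_COUNT = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Degenerate primer sequences exactly as printed in Table 1.
PRIMERS = {
    "LWAD1": "NGT TGW GNA TWT SGW GNT",
    "CA51": "TGW GNA GSA NCA SAG",
    "CA52": "AGW GNA GWA NCA WAG G",
    "CA53": "STT GNT AST NCT NTG C",
    "CA54": "NTC GAS TWT SGW GTT",
    "CA55": "WGT GNA GWA NCA NAG A",
}


def degeneracy(primer):
    """Return the number of distinct sequences encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return prod(IUPAC_COUNT[base] for base in bases)


for name, sequence in PRIMERS.items():
    print(name, degeneracy(sequence))
```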

[0175] TABLE 2 NESTED PRIMERS

ID NO  PRIMER  PRIMER SEQUENCE                        NOTES
19     5A      ACTAGCTCTACCGTTTCCGTTTCCGTTTAC         DS5 PRIMARY
20     5B      TTACCTCGGGTTCGAAATCGATCGGGATAA         DS5 SECONDARY
21     5C      AAAATCGGTTATACGATAACGGTCGGTACGGGA      DS5 TERTIARY
22     3A      GGGTCTTGCGGATCTGAATATATGTTTTCATGTGTG   DS3 PRIMARY
23     3B      TACCGAAGAAAAATACCGGTTCCCGTCCGATTTCGAC  DS3 SECONDARY
24     3C      GGATCGTATCGGTTTTCGATTACCGTATTTATCC     DS3 TERTIARY

[0176] References: 1. Liu et al. (1995) The Plant Journal, 8:457-463; 2. Liu and Whittier (1995) Genomics, 25: 674-681; 3. Tsugeki et al. (1996) The Plant Journal, 10: 479-489

Example 2

[0177] Sequence Analysis of Tagged Seedling Lethal Line GT1802

[0178] For transposant line GT1802, PCR products are obtained from the Ds3 border and the Ds5 border. The preliminary sequences, obtained from the TAIL-PCR, are used in BLASTn searches against nucleotide databases (Altschul et al. (1990) J Mol. Biol. 215:403-410; Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The initial sequence obtained for the region bordering the Ds3 end of the transposon indicates that the transposon has inserted with the Ds3 end adjacent to Arabidopsis genomic DNA (base numbers 72998 and higher of Arabidopsis chromosome 4, BAC F4C21, Genbank accession number AC005275). The initial sequence of the region bordering the Ds5 end indicates that the transposon has inserted with the Ds5 end adjacent to Arabidopsis genomic DNA (base numbers 72991 and lower of Arabidopsis chromosome 4, BAC F4C21). Analysis of the border sequences reveals a nine base pair duplication that occurred during the transposon insertion, corresponding to bases 72991 through 72998 of BAC F4C21. The transposon insertion region of BAC F4C21 is annotated as encoding a putative component of the cytochrome B6-F complex (GenPept accession number CAB52433).

[0179] The ORF for this gene has been identified and deposited in GenBank (accession number AJ243702).

[0180] Analysis of the cDNA sequence from this gene reveals a high degree of sequence similarity to other proteins identified as components of the cytochrome B6-F complex (see homolog table GT1802). A polymorphism is noted between the A. thaliana ecotype Landsberg cDNA and ecotype Columbia genomic DNA. The polymorphic difference is shown in the table below:

Position of base in cDNA  cDNA base  genomic DNA base
142                       G          T

GT1802 HOMOLOGS

Description                                                                        Accession #  Database   % ID
Rieske iron-sulfur protein of cytochrome B6/F complex (Chlamydomonas reinhardtii)  CAA53947     GenPept    61.9
Rieske iron-sulfur precursor protein (Oryza sativa)                                AAC78103     GenPept    76.8
Plastoquinol-plastocyanin reductase (Synechocystis)                                CAA41421     GenPept    57.8
Chloroplast Rieske iron-sulfur protein (Pisum sativum)                             CAA45151     GenPept    76.2
  DNA                                                                              X63605       Genbank    73.7
Rieske iron-sulfur precursor (Spinacia oleracea)                                   CAA29590     GenPept    77.9
Rieske iron-sulfur protein (Nicotiana tabacum)                                     Q02585       SwissProt  78.9
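The specification does not state how the % ID values in the homolog tables were calculated. As a minimal sketch of one common convention, percent identity can be taken as the fraction of identical residues over the aligned columns in which neither sequence has a gap. The two aligned fragments below are made up for illustration and are not sequences from the tables; the function name is likewise an assumption.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Made-up, pre-aligned peptide fragments (illustration only).
fragment_a = "MASS-SLSPAT"
fragment_b = "MATSASLTPAT"
print(percent_identity(fragment_a, fragment_b))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values from different tools are not directly comparable.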

Example 3

[0181] Sequence Analysis of Tagged Seedling Lethal Line GT1209

[0182] For transposant line GT1209, PCR products are obtained from the Ds5 border. The preliminary sequences obtained from the TAIL-PCR are used in BLASTn searches against nucleotide databases (Altschul et al. (1990) J Mol. Biol. 215:403-410; Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The initial sequence of the region bordering the Ds5 end indicates that the transposon has inserted with the Ds5 end adjacent to Arabidopsis genomic DNA (base numbers 33100 and higher of Arabidopsis BAC F12K11, Genbank accession number AC007592). This region of BAC F12K11 is annotated as encoding a protein of unknown function (GenPept accession number AAF24811).

[0183] To identify the ORF for this gene, primers are designed to the 5′ and 3′ ends of the predicted ORF. PCR is performed using template DNA from the pFL61 Arabidopsis cDNA library (Minet et al. (1992) Plant J. 2: 417-422). A resulting PCR product is TA-cloned (Original TA-Cloning kit, Invitrogen) and sequenced. The cDNA sequence differs from the sequence predicted in the Genbank annotation, thus identifying for the first time the actual open reading frame. Analysis of the cDNA sequence from this gene reveals a high degree of similarity to a component of the anaphase promoting complex in human and mouse (see homolog table GT1209). A few polymorphisms are noted between the A. thaliana ecotype Landsberg cDNA and ecotype Columbia genomic DNA. These polymorphic differences are shown in the table below:

Position of base in cDNA  cDNA base  genomic DNA base
154                       C          T
188                       A          T
766                       T          A
1335                      C          T
1938                      A          G
2637                      G          C
2653                      C          T
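A comparison of this kind can be sketched in Python as follows. The sketch assumes that the cDNA and the corresponding genomic coding sequence (with introns already removed) have the same length and are in register; the two short sequences are made up for illustration and the function name is hypothetical.

```python
def list_polymorphisms(cdna, genomic):
    """Return (1-based cDNA position, cDNA base, genomic base) for every mismatch."""
    if len(cdna) != len(genomic):
        raise ValueError("sequences must be the same length and in register")
    return [
        (position, c_base, g_base)
        for position, (c_base, g_base) in enumerate(
            zip(cdna.upper(), genomic.upper()), start=1
        )
        if c_base != g_base
    ]


# Made-up sequences standing in for Landsberg cDNA and Columbia genomic DNA.
landsberg_cdna = "ATGGCGTCCTCA"
columbia_genomic = "ATGGCTTCCTGA"

for position, c_base, g_base in list_polymorphisms(landsberg_cdna, columbia_genomic):
    print(position, c_base, g_base)
```

Each output row has the same three columns as the polymorphism tables: position in the cDNA, cDNA base, and genomic DNA base.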

[0184] A second cDNA PCR product of higher molecular weight is identified, TA cloned, and sequenced (SEQ ID NO:25). The resulting sequence is analyzed and may represent an incompletely spliced mRNA transcript of the above mentioned cDNA. The higher molecular weight clone (3236 nucleotides) is missing 5 nucleotides corresponding to nucleotides 32104-32108 of BAC F12K11, when compared to the lower molecular weight cDNA clone (2746 nucleotides). Extra nucleotide sequence in the higher molecular weight cDNA corresponds to nucleotides 35359-35400, 35416-35663, and 35770-35957 of BAC F12K11 when compared to the lower molecular weight cDNA clone. These differences result in an ORF of only 1053 nucleotides in the higher molecular weight clone (the corresponding translation is in SEQ ID NO:26), while the ORF of the lower molecular weight cDNA is 2746 nucleotides in length.

GT1209 HOMOLOGS

Description                                          Accession #  Database   % ID
Unnamed protein product (Mus musculus)               BAA95076     GenPept    30.7
Anaphase-promoting complex subunit 5 (Homo sapiens)  AAF05753     GenPept    30.3
  DNA                                                AF191339     Genbank    38.9
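The effect of retained sequence on ORF length, as discussed in paragraph [0184], can be illustrated with a simple ORF scan. The sketch below is an assumption about how such an analysis might be done, not the procedure used in the specification: it looks only at the three forward reading frames, requires an ATG start, and counts the stop codon in the reported length. The test sequence is a made-up toy.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(sequence):
    """Return (start, end, length) of the longest ATG-to-stop ORF.

    start is 0-based, end is exclusive and includes the stop codon.
    Only the three forward frames are scanned. Returns (0, 0, 0) if none found.
    """
    sequence = sequence.upper()
    best = (0, 0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if length > best[2]:
                    best = (start, i + 3, length)
                start = None  # look for the next ORF in this frame
    return best


# Made-up toy sequence (illustration only).
toy = "CCATGAAATGGTAACATGTGA"
start, end, length = longest_orf(toy)
print(start, end, length, toy[start:end])
```

An insertion that introduces an in-frame stop codon, as retained intron sequence often does, shortens the longest ORF found by such a scan.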

Example 4

[0185] Sequence Analysis of Tagged Seedling Lethal Line GT1354

[0186] For transposant line GT1354, PCR products are obtained from the Ds3 border and from the Ds5 border. The preliminary sequences obtained from the TAIL-PCR are used in BLASTn searches against nucleotide databases (Altschul et al. (1990) J Mol. Biol. 215:403-410; Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The initial sequence of the region bordering the Ds3 end indicates that the transposon has inserted with the Ds3 end adjacent to Arabidopsis genomic DNA (base numbers 54823 and lower of Arabidopsis section 179 of 255 on chromosome 2, Genbank accession number AC006533). The initial sequence of the region bordering the Ds5 end indicates that the transposon has inserted with the Ds5 end adjacent to Arabidopsis genomic DNA (base numbers 54823 and higher of Arabidopsis section 179 of 255 on chromosome 2, Genbank accession number AC006533). This region of BAC F20M17 is annotated as encoding a putative protein of unknown function (GenPept accession number AAB32288).

[0187] To identify the ORF for this gene, primers are designed to the 5′ and 3′ ends of the predicted ORF. PCR is performed using template DNA from the pFL61 Arabidopsis cDNA library (Minet et al. (1992) Plant J. 2: 417-422). The resulting PCR product is TA-cloned (Original TA-Cloning kit, Invitrogen) and sequenced. The cDNA sequence is the same as the sequence predicted in the Genbank annotation, thus validating for the first time the putative open reading frame annotation. Analysis of the cDNA sequence from this gene reveals a high degree of sequence similarity with other Arabidopsis hypothetical proteins (see homolog table GT1354). A polymorphism is noted between the A. thaliana ecotype Landsberg cDNA and ecotype Columbia genomic DNA. The polymorphic difference is shown in the table below:

Position of base in cDNA  cDNA base  genomic DNA base
140                       T          C

GT1354 HOMOLOGS

Description                                                  Accession #  Database   % ID
Hypothetical protein (Arabidopsis thaliana)                  CAB81447     GenPept    35.2
Chromosome 4, contig fragment No. 69 (Arabidopsis thaliana)  AL161573     Genbank    41.2

Example 5

[0188] Sequence Analysis of Tagged Seedling Lethal Line GT0946

[0189] For transposant line GT0946, PCR products are obtained from the Ds3 border and from the Ds5 border. The preliminary sequences obtained from TAIL-PCR are used in BLASTn searches against nucleotide databases (Altschul et al. (1990) J Mol. Biol. 215:403-410; Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The initial sequence of the region bordering the Ds3 end indicates that the transposon has inserted with the Ds3 end adjacent to Arabidopsis genomic DNA (base numbers 21164 and higher of Arabidopsis section 10 of 255 on chromosome 2, Genbank accession number AC004136). The initial sequence of the region bordering the Ds5 end indicates that the transposon has inserted with the Ds5 end adjacent to Arabidopsis genomic DNA (base numbers 21173 and lower of Arabidopsis section 10 of 255 on chromosome 2, Genbank accession number AC004136). Analysis of the border sequences reveals a nine base pair duplication that occurred during the transposon insertion, corresponding to bases 21164 through 21173 of BAC T8K22. This region of section 10 of 255 on chromosome 2 is annotated as encoding a putative sugar nucleotide phosphorylase (GenPept accession number ACC18936).

[0190] To identify the ORF for this gene, primers are designed to the 5′ and 3′ ends of the predicted ORF. PCR is performed using template DNA from the pFL61 Arabidopsis cDNA library (Minet et al. (1992) Plant J. 2: 417-422). The resulting PCR product is TA-cloned (Original TA-Cloning kit, Invitrogen) and sequenced. The cDNA sequence differs from the sequence predicted in the Genbank annotation, thus identifying for the first time the actual open reading frame. Analysis of the cDNA sequence from this gene reveals that it is identical with an Arabidopsis cDNA encoding a 4-Diphosphocytidyl-2C-methyl-D-erythritol synthase (Genbank accession number AF230737) and similar to homologs from other species (see homolog table GT0946).

[0191] A few polymorphisms are noted between the A. thaliana ecotype Landsberg cDNA and ecotype Columbia genomic DNA. These polymorphic differences are shown in the table below:

Position of base in cDNA  cDNA base  genomic DNA base
88                        G          T
300                       T          C
531                       T          C
643                       G          C
660                       C          G
741                       A          G
756                       G          A
819                       A          T

GT0946 HOMOLOGS

Description                                                            Accession #  Database   % ID
Unknown function (Bacillus subtilis)                                   AAA21794     GenPept    38.4
Conserved hypothetical protein (Haemophilus influenzae)                AAC22332     GenPept    32.6
4-diphosphocytidyl-2C-methyl-D-erythritol synthase (Escherichia coli)  AAF43207     GenPept    33.0
Hypothetical protein Rv3582c (Mycobacterium tuberculosis)              CAB07156     GenPept    33.2
Hypothetical protein (Synechocystis sp)                                BAA18417     GenPept    33.0
Brassica napus DNA (EST, partial cDNA)                                 AI352824     Genbank    84.6

Example 6

[0192] Expression of Recombinant GT1802, GT1209, GT1354, or GT0946 Protein in E. coli

[0193] The coding region of the protein, corresponding to the cDNA clone SEQ ID NO:1 is subcloned into an appropriate expression vector, and transformed into E. coli using the manufacturer's conditions. Specific examples include plasmids such as pBluescript (Stratagene, La Jolla, Calif.), pFLAG (International Biotechnologies, Inc., New Haven, Conn.), and pTrcHis (Invitrogen, La Jolla, Calif.). E. coli is cultured, and expression of the GT1802 activity is confirmed. Protein conferring GT1802 activity is isolated using standard techniques.

[0194] The coding region of the protein, corresponding to the cDNA clone SEQ ID NO:3 is subcloned into an appropriate expression vector, and transformed into E. coli using the manufacturer's conditions. Specific examples include plasmids such as pBluescript (Stratagene, La Jolla, Calif.), pFLAG (International Biotechnologies, Inc., New Haven, Conn.), and pTrcHis (Invitrogen, La Jolla, Calif.). E. coli is cultured, and expression of the GT1209 activity is confirmed. Protein conferring GT1209 activity is isolated using standard techniques.

[0195] The coding region of the protein, corresponding to the cDNA clone SEQ ID NO:5, is subcloned into an appropriate expression vector, and transformed into E. coli using the manufacturer's conditions. Specific examples include plasmids such as pBluescript (Stratagene, La Jolla, Calif.), pFLAG (International Biotechnologies, Inc., New Haven, Conn.), and pTrcHis (Invitrogen, La Jolla, Calif.). E. coli is cultured, and expression of the GT1354 activity is confirmed. Protein conferring GT1354 activity is isolated using standard techniques.

[0196] The coding region of the protein, corresponding to the cDNA clone SEQ ID NO:7, is subcloned into an appropriate expression vector, and transformed into E. coli using the manufacturer's conditions. Specific examples include plasmids such as pBluescript (Stratagene, La Jolla, Calif.), pFLAG (International Biotechnologies, Inc., New Haven, Conn.), and pTrcHis (Invitrogen, La Jolla, Calif.). E. coli is cultured, and expression of the GT0946 activity is confirmed. Protein conferring GT0946 activity is isolated using standard techniques.

Example 7

[0197] In vitro Recombination of GT1802, GT1209, GT1354, or GT0946 Genes by DNA Shuffling

[0198] The nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7 is amplified by PCR. The resulting DNA fragment is digested by DNaseI treatment essentially as described (Stemmer et al. (1994) PNAS 91: 10747-10751) and the PCR primers are removed from the reaction mixture. A PCR reaction is carried out without primers and is followed by a PCR reaction with the primers, both as described (Stemmer et al. (1994) PNAS 91: 10747-10751). The resulting DNA fragments are cloned into pTRC99a (Pharmacia, Cat no: 27-5007-01) for use in bacteria, and transformed into a bacterial strain deficient in GT1802, GT1209, GT1354, or GT0946 activity by electroporation using the Biorad Gene Pulser and the manufacturer's conditions. The transformed bacteria are grown on medium that contains inhibitory concentrations of an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity, respectively, and those colonies that grow in the presence of the inhibitor are selected. Colonies that grow in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and the DNA sequences of cDNA inserts from plasmids that pass this test are then determined. Alternatively, the DNA fragments are cloned into expression vectors for transient or stable transformation into plant cells, which are screened for differential survival and/or growth in the presence of an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity. In a similar reaction, PCR-amplified DNA fragments comprising the Arabidopsis GT1802, GT1209, GT1354, or GT0946 gene encoding the protein and PCR-amplified DNA fragments derived from or comprising another GT1802, GT1209, GT1354, or GT0946 gene are recombined in vitro and resulting variants with improved tolerance to the inhibitor are recovered as described above.
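The kind of sequence diversity produced by in vitro recombination can be pictured with a toy in silico model. The sketch below is not the Stemmer protocol: it assumes equal-length, pre-aligned parent sequences and simply switches template at randomly chosen crossover points, whereas real reassembly depends on homology-driven priming between fragments. The parent sequences, function name, and parameters are all hypothetical.

```python
import random


def make_chimera(parents, n_crossovers, rng):
    """Build one chimera by switching between parents at random crossover points.

    parents: list of at least two equal-length, pre-aligned sequences.
    n_crossovers: number of template switches.
    rng: a random.Random instance, passed in for reproducibility.
    """
    if len(parents) < 2:
        raise ValueError("at least two parents are required")
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must have equal length")
    points = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + points + [length]
    current = rng.randrange(len(parents))
    pieces = []
    for lo, hi in zip(bounds, bounds[1:]):
        pieces.append(parents[current][lo:hi])
        # Switch to a different parent at each crossover point.
        current = rng.choice([i for i in range(len(parents)) if i != current])
    return "".join(pieces)


# Made-up parent sequences (illustration only).
parents = ["ATGGCGTCCTCATCCCTT", "ATGGCTTCGTCTTCACTG"]
rng = random.Random(0)
library = [make_chimera(parents, 2, rng) for _ in range(5)]
for chimera in library:
    print(chimera)
```

In the procedure described above, the resulting library would then be subjected to selection on inhibitor-containing medium; the model covers only the recombination step.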

Example 8

[0199] In vitro Recombination of GT1802, GT1209, GT1354, or GT0946 Genes by Staggered Extension Process

[0200] The Arabidopsis GT1802 gene encoding the protein and another GT1802 gene, or homolog thereof, or fragment thereof, are each cloned into the polylinker of a pBluescript vector. A PCR reaction is carried out essentially as described (Zhao et al. (1998) Nature Biotechnology 16: 258-261) using the “reverse primer” and the “M13 -20 primer” (Stratagene Catalog). Amplified PCR fragments are digested with appropriate restriction enzymes and cloned into pTRC99a and mutated GT1802 genes are screened as described in Example 7. The same procedure is carried out with genes encoding GT1209, GT1354, or GT0946 proteins, respectively.

Example 9

[0201] In Vitro Binding Assays

[0202] Recombinant GT1802 protein is obtained, for example, according to Example 6. The protein is immobilized on chips appropriate for ligand binding assays using techniques which are well known in the art. The protein immobilized on the chip is exposed to a sample compound in solution according to methods well known in the art. While the sample compound is in contact with the immobilized protein, measurements capable of detecting protein-ligand interactions are conducted. Examples of such measurements are SELDI, Biacore and FCS, described above. Compounds found to bind the protein are readily discovered in this fashion and are subjected to further characterization.

[0203] Recombinant GT1209 protein is obtained, for example, according to Example 6. The protein is immobilized on chips appropriate for ligand binding assays using techniques which are well known in the art. The protein immobilized on the chip is exposed to a sample compound in solution according to methods well known in the art. While the sample compound is in contact with the immobilized protein, measurements capable of detecting protein-ligand interactions are conducted. Examples of such measurements are SELDI, Biacore and FCS, described above. Compounds found to bind the protein are readily discovered in this fashion and are subjected to further characterization.

[0204] Recombinant GT1354 protein is obtained, for example, according to Example 6. The protein is immobilized on chips appropriate for ligand binding assays using techniques which are well known in the art. The protein immobilized on the chip is exposed to a sample compound in solution according to methods well known in the art. While the sample compound is in contact with the immobilized protein, measurements capable of detecting protein-ligand interactions are conducted. Examples of such measurements are SELDI, Biacore and FCS, described above. Compounds found to bind the protein are readily discovered in this fashion and are subjected to further characterization.

[0205] Recombinant GT0946 protein is obtained, for example, according to Example 6. The protein is immobilized on chips appropriate for ligand binding assays using techniques which are well known in the art. The protein immobilized on the chip is exposed to a sample compound in solution according to methods well known in the art. While the sample compound is in contact with the immobilized protein, measurements capable of detecting protein-ligand interactions are conducted. Examples of such measurements are SELDI, Biacore and FCS, described above. Compounds found to bind the protein are readily discovered in this fashion and are subjected to further characterization.

Example 10

[0206] Plastid Transformation

[0207] Transformation Vectors

[0208] For expression of a nucleotide sequence encoding a polypeptide having GT1802, GT1209, GT1354, or GT0946 activity in plant plastids, plastid transformation vector pPH143 or pPH145 (WO 97/32011, incorporated herein by reference) is used. The nucleotide sequence is inserted into pPH143, thereby replacing the PROTOX coding sequence. This vector is then used for plastid transformation and selection of transformants for spectinomycin resistance. Alternatively, the nucleotide sequence is inserted in pPH143 so that it replaces the aadA gene. In this case, transformants are selected for resistance to PROTOX inhibitors.

[0209] Plastid Transformation

[0210] Seeds of Nicotiana tabacum c.v. ‘Xanthi nc’ are germinated seven per plate in a 1″ circular array on T agar medium and bombarded 12-14 days after sowing with 1 μm tungsten particles (M10, Biorad, Hercules, Calif.) coated with DNA from plasmids pPH143 and pPH145 essentially as described (Svab, Z. and Maliga, P. (1993) Proc. Natl. Acad. Sci. USA 90, 913-917). Bombarded seedlings are incubated on T medium for two days after which leaves are excised and placed abaxial side up in bright light (350-500 μmol photons/m²/s) on plates of RMOP medium (Svab, Z., Hajdukiewicz, P. and Maliga, P. (1990) Proc. Natl. Acad. Sci. USA 87, 8526-8530) containing 500 μg/ml spectinomycin dihydrochloride (Sigma, St. Louis, Mo.). Resistant shoots appearing underneath the bleached leaves three to eight weeks after bombardment are subcloned onto the same selective medium, allowed to form callus, and secondary shoots isolated and subcloned. Complete segregation of transformed plastid genome copies (homoplasmicity) in independent subclones is assessed by standard techniques of Southern blotting (Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor). Homoplasmic shoots are rooted aseptically on spectinomycin-containing MS/IBA medium (McBride, K. E. et al. (1994) Proc. Natl. Acad. Sci. USA 91, 7301-7305) and transferred to the greenhouse.

[0211] The above disclosed embodiments are illustrative. This disclosure of the invention will place one skilled in the art in possession of many variations of the invention. All such obvious and foreseeable variations are intended to be encompassed by the appended claims.
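As a reading aid for the sequence listing, the correspondence between a coding sequence and its listed translation can be checked programmatically. The following is a minimal Python sketch under stated assumptions, not part of the disclosed methods: it assumes the standard genetic code, and the names `translate_cds` and `to_three_letter` are hypothetical. It is applied to the first sixteen codons of SEQ ID NO:1.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter amino acid codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence (whitespace ignored) up to the first stop."""
    cleaned = "".join(cds.split()).upper()
    if len(cleaned) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(cleaned), 3):
        aa = CODON_TABLE[cleaned[i:i + 3]]
        if aa == "*":  # stop codon: end of the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


def to_three_letter(protein: str) -> str:
    """Render a one-letter protein string in three-letter codes."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


if __name__ == "__main__":
    # First sixteen codons of SEQ ID NO:1 as given in the sequence listing.
    seq1_start = "atg gcg tcc tca tcc ctt tcc cct gct act cag ctt ggt tct agc aga"
    print(to_three_letter(translate_cds(seq1_start)))
    # prints: Met Ala Ser Ser Ser Leu Ser Pro Ala Thr Gln Leu Gly Ser Ser Arg
```

The printed residues agree with positions 1 to 16 of SEQ ID NO:2. The same check can be run on any full-length coding sequence in the listing once its nucleotides are separated from the interleaved position numbers and residue labels.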

1 26 1 690 DNA Arabidopsis thaliana CDS (1)..(690) 1 atg gcg tcc tca tcc ctt tcc cct gct act cag ctt ggt tct agc aga 48 Met Ala Ser Ser Ser Leu Ser Pro Ala Thr Gln Leu Gly Ser Ser Arg 1 5 10 15 agt gct ttg atg gcg atg tca agt ggg ttg ttt gtg aag cca acg aag 96 Ser Ala Leu Met Ala Met Ser Ser Gly Leu Phe Val Lys Pro Thr Lys 20 25 30 atg aat cat caa atg gtt aga aaa gag aag att gga ttg aga att gct 144 Met Asn His Gln Met Val Arg Lys Glu Lys Ile Gly Leu Arg Ile Ala 35 40 45 tgt caa gcg tcg agt att cca gca gac aga gtt cca gat atg gaa aag 192 Cys Gln Ala Ser Ser Ile Pro Ala Asp Arg Val Pro Asp Met Glu Lys 50 55 60 agg aag act ttg aat ctt ctt ctt ctt ggg gct ctt tct cta cct act 240 Arg Lys Thr Leu Asn Leu Leu Leu Leu Gly Ala Leu Ser Leu Pro Thr 65 70 75 80 ggc tac atg ctt gtc cct tac gct acc ttc ttt gtt cct cct gga acc 288 Gly Tyr Met Leu Val Pro Tyr Ala Thr Phe Phe Val Pro Pro Gly Thr 85 90 95 gga ggt gga ggt ggt ggt act cca gcc aag gat gcc ctt gga aac gat 336 Gly Gly Gly Gly Gly Gly Thr Pro Ala Lys Asp Ala Leu Gly Asn Asp 100 105 110 gta gtt gca gcg gaa tgg ctt aag act cat ggt ccc ggt gac cga acc 384 Val Val Ala Ala Glu Trp Leu Lys Thr His Gly Pro Gly Asp Arg Thr 115 120 125 ttg acc caa gga tta aag gga gat ccg act tac cta gtt gta gag aac 432 Leu Thr Gln Gly Leu Lys Gly Asp Pro Thr Tyr Leu Val Val Glu Asn 130 135 140 gac aag act cta gcg aca tac ggt atc aac gca gtg tgc act cat ctt 480 Asp Lys Thr Leu Ala Thr Tyr Gly Ile Asn Ala Val Cys Thr His Leu 145 150 155 160 gga tgt gtt gtg cca tgg aac aaa gct gag aac aag ttt cta tgt cct 528 Gly Cys Val Val Pro Trp Asn Lys Ala Glu Asn Lys Phe Leu Cys Pro 165 170 175 tgc cat gga tcc caa tac aac gcc caa gga aga gtc gtt aga ggt cca 576 Cys His Gly Ser Gln Tyr Asn Ala Gln Gly Arg Val Val Arg Gly Pro 180 185 190 gcc cca ttg tcg cta gcg ttg gct cac gcg gat ata gat gaa gct ggg 624 Ala Pro Leu Ser Leu Ala Leu Ala His Ala Asp Ile Asp Glu Ala Gly 195 200 205 aag gtt ctt ttt gtt cca tgg gtg gaa act gac ttc agg act ggt gat 672 Lys Val Leu Phe 
Val Pro Trp Val Glu Thr Asp Phe Arg Thr Gly Asp 210 215 220 gct cca tgg tgg tct taa 690 Ala Pro Trp Trp Ser 225 230 2 229 PRT Arabidopsis thaliana 2 Met Ala Ser Ser Ser Leu Ser Pro Ala Thr Gln Leu Gly Ser Ser Arg 1 5 10 15 Ser Ala Leu Met Ala Met Ser Ser Gly Leu Phe Val Lys Pro Thr Lys 20 25 30 Met Asn His Gln Met Val Arg Lys Glu Lys Ile Gly Leu Arg Ile Ala 35 40 45 Cys Gln Ala Ser Ser Ile Pro Ala Asp Arg Val Pro Asp Met Glu Lys 50 55 60 Arg Lys Thr Leu Asn Leu Leu Leu Leu Gly Ala Leu Ser Leu Pro Thr 65 70 75 80 Gly Tyr Met Leu Val Pro Tyr Ala Thr Phe Phe Val Pro Pro Gly Thr 85 90 95 Gly Gly Gly Gly Gly Gly Thr Pro Ala Lys Asp Ala Leu Gly Asn Asp 100 105 110 Val Val Ala Ala Glu Trp Leu Lys Thr His Gly Pro Gly Asp Arg Thr 115 120 125 Leu Thr Gln Gly Leu Lys Gly Asp Pro Thr Tyr Leu Val Val Glu Asn 130 135 140 Asp Lys Thr Leu Ala Thr Tyr Gly Ile Asn Ala Val Cys Thr His Leu 145 150 155 160 Gly Cys Val Val Pro Trp Asn Lys Ala Glu Asn Lys Phe Leu Cys Pro 165 170 175 Cys His Gly Ser Gln Tyr Asn Ala Gln Gly Arg Val Val Arg Gly Pro 180 185 190 Ala Pro Leu Ser Leu Ala Leu Ala His Ala Asp Ile Asp Glu Ala Gly 195 200 205 Lys Val Leu Phe Val Pro Trp Val Glu Thr Asp Phe Arg Thr Gly Asp 210 215 220 Ala Pro Trp Trp Ser 225 3 2751 DNA Arabidopsis thaliana CDS (1)..(2751) 3 atg gcc gga tta acg aga acg gcc ggt gct ttt gcg gta act cca cac 48 Met Ala Gly Leu Thr Arg Thr Ala Gly Ala Phe Ala Val Thr Pro His 1 5 10 15 aag atc tcc gtt tgc att ctc ctg cag ata tac gct cct tcc gct cag 96 Lys Ile Ser Val Cys Ile Leu Leu Gln Ile Tyr Ala Pro Ser Ala Gln 20 25 30 atg tct ctt cct ttt cct ttc tct tcc gtt gct cag cac aac cgc ctc 144 Met Ser Leu Pro Phe Pro Phe Ser Ser Val Ala Gln His Asn Arg Leu 35 40 45 ggc ctc tac ctg ctc tct ctt act aag tct tgc gat gat ata tat gag 192 Gly Leu Tyr Leu Leu Ser Leu Thr Lys Ser Cys Asp Asp Ile Tyr Glu 50 55 60 ccg aag ctg gaa aag ctc atc aac cag ttg agg gaa gtt ggt gaa gag 240 Pro Lys Leu Glu Lys Leu Ile Asn Gln Leu Arg Glu Val Gly Glu Glu 65 70 75 80 atg gac gcg tgg 
cta act gac cat tta act aat aga ttt tcc tct ttg 288 Met Asp Ala Trp Leu Thr Asp His Leu Thr Asn Arg Phe Ser Ser Leu 85 90 95 gct tca cca gat gat cta tta aat ttc ttt aat gac atg cga gga ata 336 Ala Ser Pro Asp Asp Leu Leu Asn Phe Phe Asn Asp Met Arg Gly Ile 100 105 110 ctt ggg agc ctt gat tca gga gtc gtg caa gat gat cag att att ttg 384 Leu Gly Ser Leu Asp Ser Gly Val Val Gln Asp Asp Gln Ile Ile Leu 115 120 125 gat ccc aac agc aac ttg gga atg ttt gtt cgt cgt tgc att ttg gca 432 Asp Pro Asn Ser Asn Leu Gly Met Phe Val Arg Arg Cys Ile Leu Ala 130 135 140 ttc aac ctt tta tcg ttc gag gga gtt tgt cat ctt ttt tca agt att 480 Phe Asn Leu Leu Ser Phe Glu Gly Val Cys His Leu Phe Ser Ser Ile 145 150 155 160 gaa gat tac tgc aaa gaa gcc cat tca agc ttt gct cag ttt ggt gca 528 Glu Asp Tyr Cys Lys Glu Ala His Ser Ser Phe Ala Gln Phe Gly Ala 165 170 175 cct aat aat aat ctg gag tca tta ata caa tat gat cag atg gat atg 576 Pro Asn Asn Asn Leu Glu Ser Leu Ile Gln Tyr Asp Gln Met Asp Met 180 185 190 gag aat tat gca atg gat aaa cca act gaa gaa ata gag ttt cag aaa 624 Glu Asn Tyr Ala Met Asp Lys Pro Thr Glu Glu Ile Glu Phe Gln Lys 195 200 205 act gct agt gga att gtc cct ttt cac ctt cat aca cca gat tca ctt 672 Thr Ala Ser Gly Ile Val Pro Phe His Leu His Thr Pro Asp Ser Leu 210 215 220 atg aaa gcg aca gaa ggt ttg cta cat aat agg aag gaa aca tca agg 720 Met Lys Ala Thr Glu Gly Leu Leu His Asn Arg Lys Glu Thr Ser Arg 225 230 235 240 acc agc aag aaa gat aca gaa gct act cca gtt gct cgt gcc tca tca 768 Thr Ser Lys Lys Asp Thr Glu Ala Thr Pro Val Ala Arg Ala Ser Ser 245 250 255 agt aca ctt gag gaa tct ctg gta gat gag tca tta ttc ctt cgg aca 816 Ser Thr Leu Glu Glu Ser Leu Val Asp Glu Ser Leu Phe Leu Arg Thr 260 265 270 aat ttg cag ata caa ggc ttt tta atg gaa cag gcc gat gca att gaa 864 Asn Leu Gln Ile Gln Gly Phe Leu Met Glu Gln Ala Asp Ala Ile Glu 275 280 285 atc cat gga agt tca agt tca ttc tct tca agt tcc atc gaa agt ttc 912 Ile His Gly Ser Ser Ser Ser Phe Ser Ser Ser Ser Ile Glu Ser Phe 290 
295 300 ctt gat cag ctt cag aaa tta gcc cct gaa ctg cat cgt gtt cac ttt 960 Leu Asp Gln Leu Gln Lys Leu Ala Pro Glu Leu His Arg Val His Phe 305 310 315 320 ttg cgt tac ttg aat aaa ctt cac agt gat gac tac ttt gct gct ttg 1008 Leu Arg Tyr Leu Asn Lys Leu His Ser Asp Asp Tyr Phe Ala Ala Leu 325 330 335 gat aat ctc ctc cgt tac ttt gat tac agt gca ggg act gag gga ttt 1056 Asp Asn Leu Leu Arg Tyr Phe Asp Tyr Ser Ala Gly Thr Glu Gly Phe 340 345 350 gac ctt gtt cct cct tca act ggc tgc agc atg tat gga agg tac gag 1104 Asp Leu Val Pro Pro Ser Thr Gly Cys Ser Met Tyr Gly Arg Tyr Glu 355 360 365 att ggt ttg cta tgt ctg gga atg atg cat ttc cga ttt ggg cat cct 1152 Ile Gly Leu Leu Cys Leu Gly Met Met His Phe Arg Phe Gly His Pro 370 375 380 aat ctg gct cta gag gtt ttg aca gaa gct gtg cgt gta tca cag cag 1200 Asn Leu Ala Leu Glu Val Leu Thr Glu Ala Val Arg Val Ser Gln Gln 385 390 395 400 ctt agt aat gat act tgt cta gca tat acg cta gca gca atg agc aac 1248 Leu Ser Asn Asp Thr Cys Leu Ala Tyr Thr Leu Ala Ala Met Ser Asn 405 410 415 ttg tta tcg gaa atg ggc att gca agt acc tcc ggt gtt ctc gga tcc 1296 Leu Leu Ser Glu Met Gly Ile Ala Ser Thr Ser Gly Val Leu Gly Ser 420 425 430 tca tac tca ccc gtc act agc act gcg tct tca tta tcc gta caa caa 1344 Ser Tyr Ser Pro Val Thr Ser Thr Ala Ser Ser Leu Ser Val Gln Gln 435 440 445 aga gtg tac ata ctt ttg aaa gag tct ttg agg aga gct gac agt cta 1392 Arg Val Tyr Ile Leu Leu Lys Glu Ser Leu Arg Arg Ala Asp Ser Leu 450 455 460 aag tta aga cgc tta gtg gct tct aat cat ctt gcg atg gct aaa ttt 1440 Lys Leu Arg Arg Leu Val Ala Ser Asn His Leu Ala Met Ala Lys Phe 465 470 475 480 gag ttg atg cat gtg caa agg cct cta ctg tca ttt ggt ccc aaa gct 1488 Glu Leu Met His Val Gln Arg Pro Leu Leu Ser Phe Gly Pro Lys Ala 485 490 495 tct atg cgt cac aaa act tgt cca gtt agt gtc tgc aag gaa ata aga 1536 Ser Met Arg His Lys Thr Cys Pro Val Ser Val Cys Lys Glu Ile Arg 500 505 510 cta ggg gca cac cta atc agc gac ttt tct tct gaa agc tct aca atg 1584 Leu Gly Ala His Leu Ile 
Ser Asp Phe Ser Ser Glu Ser Ser Thr Met 515 520 525 aca att gat ggt tct cta agc tcg gct tgg ctt aaa gac ttg caa aaa 1632 Thr Ile Asp Gly Ser Leu Ser Ser Ala Trp Leu Lys Asp Leu Gln Lys 530 535 540 cca tgg ggt cca cct gtg att tcc cca gac tcc ggt tct aga aaa agt 1680 Pro Trp Gly Pro Pro Val Ile Ser Pro Asp Ser Gly Ser Arg Lys Ser 545 550 555 560 tca act ttt ttt caa ctc tgt gat cat ttg gtc tca att cct gga tcc 1728 Ser Thr Phe Phe Gln Leu Cys Asp His Leu Val Ser Ile Pro Gly Ser 565 570 575 gtg tca caa tta ata ggt gct tct tat tta ctc cgg gct act tca tgg 1776 Val Ser Gln Leu Ile Gly Ala Ser Tyr Leu Leu Arg Ala Thr Ser Trp 580 585 590 gag tta tat ggc agc gct ccc atg gct cgg atg aat acc ttg gtg tat 1824 Glu Leu Tyr Gly Ser Ala Pro Met Ala Arg Met Asn Thr Leu Val Tyr 595 600 605 gca act tta ttc ggt gac tct tct agt tcg tct gac gca gag tta gca 1872 Ala Thr Leu Phe Gly Asp Ser Ser Ser Ser Ser Asp Ala Glu Leu Ala 610 615 620 tac ttg aag ctc att caa cat ttg gca cta tat aag gga tac aaa gat 1920 Tyr Leu Lys Leu Ile Gln His Leu Ala Leu Tyr Lys Gly Tyr Lys Asp 625 630 635 640 gcc ttt gct gct ctt aaa gtc gca gag gaa aag ttc tta acc gta tcg 1968 Ala Phe Ala Ala Leu Lys Val Ala Glu Glu Lys Phe Leu Thr Val Ser 645 650 655 aaa tca aaa gta ttg ttg ctc aag ttg caa cta cta cat gag cgt gcc 2016 Lys Ser Lys Val Leu Leu Leu Lys Leu Gln Leu Leu His Glu Arg Ala 660 665 670 ttg cat tgt ggg aat tta aaa cta gct caa cga ata tgt aat gag cta 2064 Leu His Cys Gly Asn Leu Lys Leu Ala Gln Arg Ile Cys Asn Glu Leu 675 680 685 gga ggc ttg gca tca aca gcc atg ggt gta gac atg gag cta aaa gta 2112 Gly Gly Leu Ala Ser Thr Ala Met Gly Val Asp Met Glu Leu Lys Val 690 695 700 gaa gca agt ctt cgt gaa gct cgg act ttg ctt gca gca aaa cag tat 2160 Glu Ala Ser Leu Arg Glu Ala Arg Thr Leu Leu Ala Ala Lys Gln Tyr 705 710 715 720 agc cag gca gca aat gtg gca cac tcc ctc ttc tgc aca tgt cac aaa 2208 Ser Gln Ala Ala Asn Val Ala His Ser Leu Phe Cys Thr Cys His Lys 725 730 735 ttc aat ttg caa atc gaa aag gcg tct gtt ctt ctt 
ctg ctc gca gag 2256 Phe Asn Leu Gln Ile Glu Lys Ala Ser Val Leu Leu Leu Leu Ala Glu 740 745 750 atc cat aag aag tca gga aat gct gtc ctg ggt ctt cca tat gcg ctg 2304 Ile His Lys Lys Ser Gly Asn Ala Val Leu Gly Leu Pro Tyr Ala Leu 755 760 765 gca agc atc tcg ttt tgc cag tca ttc aac ttg gat ctt ctc aaa gca 2352 Ala Ser Ile Ser Phe Cys Gln Ser Phe Asn Leu Asp Leu Leu Lys Ala 770 775 780 tca gct act ctc act ctg gcc gag ctt tgg ctt ggt ctt gga tca aat 2400 Ser Ala Thr Leu Thr Leu Ala Glu Leu Trp Leu Gly Leu Gly Ser Asn 785 790 795 800 cat acc aaa cga gca tta gac ctt ttg cat ggg gct ttc cct atg att 2448 His Thr Lys Arg Ala Leu Asp Leu Leu His Gly Ala Phe Pro Met Ile 805 810 815 ctt ggc cat gga ggt ttg gag ttg cgt gct cga gct tac atc ttt gaa 2496 Leu Gly His Gly Gly Leu Glu Leu Arg Ala Arg Ala Tyr Ile Phe Glu 820 825 830 gca aac tgc tat cta tct gat cca agt tct tca gtt tcc aca gat tct 2544 Ala Asn Cys Tyr Leu Ser Asp Pro Ser Ser Ser Val Ser Thr Asp Ser 835 840 845 gac act gtc ttg gat tct cta agg caa gct tca gat gag ctt caa gct 2592 Asp Thr Val Leu Asp Ser Leu Arg Gln Ala Ser Asp Glu Leu Gln Ala 850 855 860 ttg gag tac cat gaa ctg gca gcg gaa gcc tcg tac tta atg gcg atg 2640 Leu Glu Tyr His Glu Leu Ala Ala Glu Ala Ser Tyr Leu Met Ala Met 865 870 875 880 gta tat gac aag ctg gga cgg ctt gat gag agg gaa gaa gct gcg tct 2688 Val Tyr Asp Lys Leu Gly Arg Leu Asp Glu Arg Glu Glu Ala Ala Ser 885 890 895 ttg ttt aag aaa cat atc ata gct ctc gag aac cct caa gat gtg gaa 2736 Leu Phe Lys Lys His Ile Ile Ala Leu Glu Asn Pro Gln Asp Val Glu 900 905 910 caa aac atg gca tga 2751 Gln Asn Met Ala 915 4 916 PRT Arabidopsis thaliana 4 Met Ala Gly Leu Thr Arg Thr Ala Gly Ala Phe Ala Val Thr Pro His 1 5 10 15 Lys Ile Ser Val Cys Ile Leu Leu Gln Ile Tyr Ala Pro Ser Ala Gln 20 25 30 Met Ser Leu Pro Phe Pro Phe Ser Ser Val Ala Gln His Asn Arg Leu 35 40 45 Gly Leu Tyr Leu Leu Ser Leu Thr Lys Ser Cys Asp Asp Ile Tyr Glu 50 55 60 Pro Lys Leu Glu Lys Leu Ile Asn Gln Leu Arg Glu Val Gly Glu Glu 65 70 75 
80 Met Asp Ala Trp Leu Thr Asp His Leu Thr Asn Arg Phe Ser Ser Leu 85 90 95 Ala Ser Pro Asp Asp Leu Leu Asn Phe Phe Asn Asp Met Arg Gly Ile 100 105 110 Leu Gly Ser Leu Asp Ser Gly Val Val Gln Asp Asp Gln Ile Ile Leu 115 120 125 Asp Pro Asn Ser Asn Leu Gly Met Phe Val Arg Arg Cys Ile Leu Ala 130 135 140 Phe Asn Leu Leu Ser Phe Glu Gly Val Cys His Leu Phe Ser Ser Ile 145 150 155 160 Glu Asp Tyr Cys Lys Glu Ala His Ser Ser Phe Ala Gln Phe Gly Ala 165 170 175 Pro Asn Asn Asn Leu Glu Ser Leu Ile Gln Tyr Asp Gln Met Asp Met 180 185 190 Glu Asn Tyr Ala Met Asp Lys Pro Thr Glu Glu Ile Glu Phe Gln Lys 195 200 205 Thr Ala Ser Gly Ile Val Pro Phe His Leu His Thr Pro Asp Ser Leu 210 215 220 Met Lys Ala Thr Glu Gly Leu Leu His Asn Arg Lys Glu Thr Ser Arg 225 230 235 240 Thr Ser Lys Lys Asp Thr Glu Ala Thr Pro Val Ala Arg Ala Ser Ser 245 250 255 Ser Thr Leu Glu Glu Ser Leu Val Asp Glu Ser Leu Phe Leu Arg Thr 260 265 270 Asn Leu Gln Ile Gln Gly Phe Leu Met Glu Gln Ala Asp Ala Ile Glu 275 280 285 Ile His Gly Ser Ser Ser Ser Phe Ser Ser Ser Ser Ile Glu Ser Phe 290 295 300 Leu Asp Gln Leu Gln Lys Leu Ala Pro Glu Leu His Arg Val His Phe 305 310 315 320 Leu Arg Tyr Leu Asn Lys Leu His Ser Asp Asp Tyr Phe Ala Ala Leu 325 330 335 Asp Asn Leu Leu Arg Tyr Phe Asp Tyr Ser Ala Gly Thr Glu Gly Phe 340 345 350 Asp Leu Val Pro Pro Ser Thr Gly Cys Ser Met Tyr Gly Arg Tyr Glu 355 360 365 Ile Gly Leu Leu Cys Leu Gly Met Met His Phe Arg Phe Gly His Pro 370 375 380 Asn Leu Ala Leu Glu Val Leu Thr Glu Ala Val Arg Val Ser Gln Gln 385 390 395 400 Leu Ser Asn Asp Thr Cys Leu Ala Tyr Thr Leu Ala Ala Met Ser Asn 405 410 415 Leu Leu Ser Glu Met Gly Ile Ala Ser Thr Ser Gly Val Leu Gly Ser 420 425 430 Ser Tyr Ser Pro Val Thr Ser Thr Ala Ser Ser Leu Ser Val Gln Gln 435 440 445 Arg Val Tyr Ile Leu Leu Lys Glu Ser Leu Arg Arg Ala Asp Ser Leu 450 455 460 Lys Leu Arg Arg Leu Val Ala Ser Asn His Leu Ala Met Ala Lys Phe 465 470 475 480 Glu Leu Met His Val Gln Arg Pro Leu Leu Ser Phe Gly Pro Lys Ala 485 490 495 Ser 
Met Arg His Lys Thr Cys Pro Val Ser Val Cys Lys Glu Ile Arg 500 505 510 Leu Gly Ala His Leu Ile Ser Asp Phe Ser Ser Glu Ser Ser Thr Met 515 520 525 Thr Ile Asp Gly Ser Leu Ser Ser Ala Trp Leu Lys Asp Leu Gln Lys 530 535 540 Pro Trp Gly Pro Pro Val Ile Ser Pro Asp Ser Gly Ser Arg Lys Ser 545 550 555 560 Ser Thr Phe Phe Gln Leu Cys Asp His Leu Val Ser Ile Pro Gly Ser 565 570 575 Val Ser Gln Leu Ile Gly Ala Ser Tyr Leu Leu Arg Ala Thr Ser Trp 580 585 590 Glu Leu Tyr Gly Ser Ala Pro Met Ala Arg Met Asn Thr Leu Val Tyr 595 600 605 Ala Thr Leu Phe Gly Asp Ser Ser Ser Ser Ser Asp Ala Glu Leu Ala 610 615 620 Tyr Leu Lys Leu Ile Gln His Leu Ala Leu Tyr Lys Gly Tyr Lys Asp 625 630 635 640 Ala Phe Ala Ala Leu Lys Val Ala Glu Glu Lys Phe Leu Thr Val Ser 645 650 655 Lys Ser Lys Val Leu Leu Leu Lys Leu Gln Leu Leu His Glu Arg Ala 660 665 670 Leu His Cys Gly Asn Leu Lys Leu Ala Gln Arg Ile Cys Asn Glu Leu 675 680 685 Gly Gly Leu Ala Ser Thr Ala Met Gly Val Asp Met Glu Leu Lys Val 690 695 700 Glu Ala Ser Leu Arg Glu Ala Arg Thr Leu Leu Ala Ala Lys Gln Tyr 705 710 715 720 Ser Gln Ala Ala Asn Val Ala His Ser Leu Phe Cys Thr Cys His Lys 725 730 735 Phe Asn Leu Gln Ile Glu Lys Ala Ser Val Leu Leu Leu Leu Ala Glu 740 745 750 Ile His Lys Lys Ser Gly Asn Ala Val Leu Gly Leu Pro Tyr Ala Leu 755 760 765 Ala Ser Ile Ser Phe Cys Gln Ser Phe Asn Leu Asp Leu Leu Lys Ala 770 775 780 Ser Ala Thr Leu Thr Leu Ala Glu Leu Trp Leu Gly Leu Gly Ser Asn 785 790 795 800 His Thr Lys Arg Ala Leu Asp Leu Leu His Gly Ala Phe Pro Met Ile 805 810 815 Leu Gly His Gly Gly Leu Glu Leu Arg Ala Arg Ala Tyr Ile Phe Glu 820 825 830 Ala Asn Cys Tyr Leu Ser Asp Pro Ser Ser Ser Val Ser Thr Asp Ser 835 840 845 Asp Thr Val Leu Asp Ser Leu Arg Gln Ala Ser Asp Glu Leu Gln Ala 850 855 860 Leu Glu Tyr His Glu Leu Ala Ala Glu Ala Ser Tyr Leu Met Ala Met 865 870 875 880 Val Tyr Asp Lys Leu Gly Arg Leu Asp Glu Arg Glu Glu Ala Ala Ser 885 890 895 Leu Phe Lys Lys His Ile Ile Ala Leu Glu Asn Pro Gln Asp Val Glu 900 905 910 Gln Asn 
Met Ala 915 5 1053 DNA Arabidopsis thaliana CDS (1)..(1053) 5 atg att ctt cca ttt tcg aca cag ttc act tgc cct gtt cag gat aat 48 Met Ile Leu Pro Phe Ser Thr Gln Phe Thr Cys Pro Val Gln Asp Asn 1 5 10 15 gga ttc agc cct tct tca cta ctt tcc cat tgc aaa aga gat cgt ttt 96 Gly Phe Ser Pro Ser Ser Leu Leu Ser His Cys Lys Arg Asp Arg Phe 20 25 30 gaa gtt act tcg ttg aga tat gac tct ttt ggt tcg gtg aag att gcc 144 Glu Val Thr Ser Leu Arg Tyr Asp Ser Phe Gly Ser Val Lys Ile Ala 35 40 45 tcc tcc tcg aag tgg aat gtt atg agg tcg agg aga aat gtg aaa gct 192 Ser Ser Ser Lys Trp Asn Val Met Arg Ser Arg Arg Asn Val Lys Ala 50 55 60 ttt ggg tta gtt gat aaa ctt ggg aag aag gta tgg aga aag aaa gaa 240 Phe Gly Leu Val Asp Lys Leu Gly Lys Lys Val Trp Arg Lys Lys Glu 65 70 75 80 gaa gat agt gac agt gaa gat gag gaa gat gaa gtg aaa gaa gag acc 288 Glu Asp Ser Asp Ser Glu Asp Glu Glu Asp Glu Val Lys Glu Glu Thr 85 90 95 ttt ggc ggg aaa gaa gcg agt ctc gat gat cca gta gag aga cgg gaa 336 Phe Gly Gly Lys Glu Ala Ser Leu Asp Asp Pro Val Glu Arg Arg Glu 100 105 110 tgg agg aag acg ata aga gag gtg att gat aag cat cct gat att gaa 384 Trp Arg Lys Thr Ile Arg Glu Val Ile Asp Lys His Pro Asp Ile Glu 115 120 125 gaa gac gaa gag att gat atg gtt gag aag agg agg aag atg cag aag 432 Glu Asp Glu Glu Ile Asp Met Val Glu Lys Arg Arg Lys Met Gln Lys 130 135 140 ctt ctt gct gat tac cca ctt gtt gtg aat gaa gag gat cct aat tgg 480 Leu Leu Ala Asp Tyr Pro Leu Val Val Asn Glu Glu Asp Pro Asn Trp 145 150 155 160 cct gaa gat gct gat ggt tgg ggg ttt agt ttc aat cag ttt ttt aat 528 Pro Glu Asp Ala Asp Gly Trp Gly Phe Ser Phe Asn Gln Phe Phe Asn 165 170 175 aag ata acg att aag aat gaa aag aag gaa gaa gag gat gat gat gaa 576 Lys Ile Thr Ile Lys Asn Glu Lys Lys Glu Glu Glu Asp Asp Asp Glu 180 185 190 gat aat gaa gga gat gat agt gag aag gag att gtt tgg caa gat gat 624 Asp Asn Glu Gly Asp Asp Ser Glu Lys Glu Ile Val Trp Gln Asp Asp 195 200 205 aac tat ata cgc ccg att aaa gat ctc aca acg gca gaa tgg gaa gag 672 Asn Tyr 
Ile Arg Pro Ile Lys Asp Leu Thr Thr Ala Glu Trp Glu Glu 210 215 220 gcg gtg ttc aaa gat att agt cct ctc atg gtc ctt gtt cac aac cgc 720 Ala Val Phe Lys Asp Ile Ser Pro Leu Met Val Leu Val His Asn Arg 225 230 235 240 tac aag agg cct aag gaa aac gaa aag ttc agg gaa gaa cta gag aag 768 Tyr Lys Arg Pro Lys Glu Asn Glu Lys Phe Arg Glu Glu Leu Glu Lys 245 250 255 gcg att caa gtt ata tgg aac tgt gga ctt cct tca cca aga tgt gtt 816 Ala Ile Gln Val Ile Trp Asn Cys Gly Leu Pro Ser Pro Arg Cys Val 260 265 270 gct gtt gat gct gtg gtt gag aca gat ttg gtc tct gct ctg aaa gta 864 Ala Val Asp Ala Val Val Glu Thr Asp Leu Val Ser Ala Leu Lys Val 275 280 285 tct gtc ttc cca gag atc atc ttc act aaa gcc ggg aag ata cta tac 912 Ser Val Phe Pro Glu Ile Ile Phe Thr Lys Ala Gly Lys Ile Leu Tyr 290 295 300 cgc gag aaa ggc att aga aca gca gat gag ctt tca aaa atc atg gct 960 Arg Glu Lys Gly Ile Arg Thr Ala Asp Glu Leu Ser Lys Ile Met Ala 305 310 315 320 ttc ttc tat tat gga gct gca aaa cca cct tgt ttg aat ggt gtc gtg 1008 Phe Phe Tyr Tyr Gly Ala Ala Lys Pro Pro Cys Leu Asn Gly Val Val 325 330 335 aat tcg caa gaa cag att cct tta gtc gat gta agt gtg aat taa 1053 Asn Ser Gln Glu Gln Ile Pro Leu Val Asp Val Ser Val Asn 340 345 350 6 350 PRT Arabidopsis thaliana 6 Met Ile Leu Pro Phe Ser Thr Gln Phe Thr Cys Pro Val Gln Asp Asn 1 5 10 15 Gly Phe Ser Pro Ser Ser Leu Leu Ser His Cys Lys Arg Asp Arg Phe 20 25 30 Glu Val Thr Ser Leu Arg Tyr Asp Ser Phe Gly Ser Val Lys Ile Ala 35 40 45 Ser Ser Ser Lys Trp Asn Val Met Arg Ser Arg Arg Asn Val Lys Ala 50 55 60 Phe Gly Leu Val Asp Lys Leu Gly Lys Lys Val Trp Arg Lys Lys Glu 65 70 75 80 Glu Asp Ser Asp Ser Glu Asp Glu Glu Asp Glu Val Lys Glu Glu Thr 85 90 95 Phe Gly Gly Lys Glu Ala Ser Leu Asp Asp Pro Val Glu Arg Arg Glu 100 105 110 Trp Arg Lys Thr Ile Arg Glu Val Ile Asp Lys His Pro Asp Ile Glu 115 120 125 Glu Asp Glu Glu Ile Asp Met Val Glu Lys Arg Arg Lys Met Gln Lys 130 135 140 Leu Leu Ala Asp Tyr Pro Leu Val Val Asn Glu Glu Asp Pro Asn Trp 145 150 155 
160 Pro Glu Asp Ala Asp Gly Trp Gly Phe Ser Phe Asn Gln Phe Phe Asn 165 170 175 Lys Ile Thr Ile Lys Asn Glu Lys Lys Glu Glu Glu Asp Asp Asp Glu 180 185 190 Asp Asn Glu Gly Asp Asp Ser Glu Lys Glu Ile Val Trp Gln Asp Asp 195 200 205 Asn Tyr Ile Arg Pro Ile Lys Asp Leu Thr Thr Ala Glu Trp Glu Glu 210 215 220 Ala Val Phe Lys Asp Ile Ser Pro Leu Met Val Leu Val His Asn Arg 225 230 235 240 Tyr Lys Arg Pro Lys Glu Asn Glu Lys Phe Arg Glu Glu Leu Glu Lys 245 250 255 Ala Ile Gln Val Ile Trp Asn Cys Gly Leu Pro Ser Pro Arg Cys Val 260 265 270 Ala Val Asp Ala Val Val Glu Thr Asp Leu Val Ser Ala Leu Lys Val 275 280 285 Ser Val Phe Pro Glu Ile Ile Phe Thr Lys Ala Gly Lys Ile Leu Tyr 290 295 300 Arg Glu Lys Gly Ile Arg Thr Ala Asp Glu Leu Ser Lys Ile Met Ala 305 310 315 320 Phe Phe Tyr Tyr Gly Ala Ala Lys Pro Pro Cys Leu Asn Gly Val Val 325 330 335 Asn Ser Gln Glu Gln Ile Pro Leu Val Asp Val Ser Val Asn 340 345 350 7 909 DNA Arabidopsis thaliana CDS (1)..(909) 7 atg gcg atg ctt cag acg aat ctt ggc ttc att act tct ccg aca ttt 48 Met Ala Met Leu Gln Thr Asn Leu Gly Phe Ile Thr Ser Pro Thr Phe 1 5 10 15 ctg tgt ccg aag ctt aaa gtc aaa ttg aac tct tat ctg ggg ttt agc 96 Leu Cys Pro Lys Leu Lys Val Lys Leu Asn Ser Tyr Leu Gly Phe Ser 20 25 30 tat cgt tct caa gtt caa aaa ctg gat ttt tcg aaa agg gtt aat aga 144 Tyr Arg Ser Gln Val Gln Lys Leu Asp Phe Ser Lys Arg Val Asn Arg 35 40 45 agc tac aaa aga gat gct tta tta ttg tca atc aag tgt tct tca tcg 192 Ser Tyr Lys Arg Asp Ala Leu Leu Leu Ser Ile Lys Cys Ser Ser Ser 50 55 60 act gga ttt gat aat agc aat gtt gtt gtg aag gag aag agt gta tct 240 Thr Gly Phe Asp Asn Ser Asn Val Val Val Lys Glu Lys Ser Val Ser 65 70 75 80 gtg att ctt tta gct gga ggt caa ggc aag aga atg aaa atg agt atg 288 Val Ile Leu Leu Ala Gly Gly Gln Gly Lys Arg Met Lys Met Ser Met 85 90 95 cca aag cag tat ata cca ctt ctt ggt cag cca att gct ttg tat agc 336 Pro Lys Gln Tyr Ile Pro Leu Leu Gly Gln Pro Ile Ala Leu Tyr Ser 100 105 110 ttt ttc acg ttt tca cgt atg cct gaa gtg aag 
gaa att gta gtt gta 384 Phe Phe Thr Phe Ser Arg Met Pro Glu Val Lys Glu Ile Val Val Val 115 120 125 tgt gat cct ttt ttc aga gac att ttt gaa gaa tac gaa gaa tca att 432 Cys Asp Pro Phe Phe Arg Asp Ile Phe Glu Glu Tyr Glu Glu Ser Ile 130 135 140 gat gtt gat ctt aga ttc gct att cct ggc aaa gaa aga caa gat tct 480 Asp Val Asp Leu Arg Phe Ala Ile Pro Gly Lys Glu Arg Gln Asp Ser 145 150 155 160 gtt tac agt gga ctt cag gaa atc gat gtg aac tct gag ctt gtt tgt 528 Val Tyr Ser Gly Leu Gln Glu Ile Asp Val Asn Ser Glu Leu Val Cys 165 170 175 att cac gac tct gcc cga ccg ttg gtg aat act gaa gat gtc gag aag 576 Ile His Asp Ser Ala Arg Pro Leu Val Asn Thr Glu Asp Val Glu Lys 180 185 190 gtc ctt aaa gat ggt tcc gcg gtt gga gca gct gta ctt ggt gtt cct 624 Val Leu Lys Asp Gly Ser Ala Val Gly Ala Ala Val Leu Gly Val Pro 195 200 205 gct aaa gct aca atc aag gag gtc aat tct gat tcc ctt gtg gtg aaa 672 Ala Lys Ala Thr Ile Lys Glu Val Asn Ser Asp Ser Leu Val Val Lys 210 215 220 act ctc gac aga aaa acc cta tgg gaa atg cag aca cca cag gtg atc 720 Thr Leu Asp Arg Lys Thr Leu Trp Glu Met Gln Thr Pro Gln Val Ile 225 230 235 240 aaa cca gag cta ttg aaa aaa ggt ttc gag ctt gtg aaa agt gaa ggt 768 Lys Pro Glu Leu Leu Lys Lys Gly Phe Glu Leu Val Lys Ser Glu Gly 245 250 255 cta gag gta aca gat gac gtt tcg att gtt gaa tac ctc aag cat cca 816 Leu Glu Val Thr Asp Asp Val Ser Ile Val Glu Tyr Leu Lys His Pro 260 265 270 gta tat gtc tct caa gga tct tat aca aac atc aag gtt aca aca cct 864 Val Tyr Val Ser Gln Gly Ser Tyr Thr Asn Ile Lys Val Thr Thr Pro 275 280 285 gat gat tta ctg ctt gct gag aga atc ttg agc gag gac tca tga 909 Asp Asp Leu Leu Leu Ala Glu Arg Ile Leu Ser Glu Asp Ser 290 295 300 8 302 PRT Arabidopsis thaliana 8 Met Ala Met Leu Gln Thr Asn Leu Gly Phe Ile Thr Ser Pro Thr Phe 1 5 10 15 Leu Cys Pro Lys Leu Lys Val Lys Leu Asn Ser Tyr Leu Gly Phe Ser 20 25 30 Tyr Arg Ser Gln Val Gln Lys Leu Asp Phe Ser Lys Arg Val Asn Arg 35 40 45 Ser Tyr Lys Arg Asp Ala Leu Leu Leu Ser Ile Lys Cys Ser Ser Ser 50 
55 60 Thr Gly Phe Asp Asn Ser Asn Val Val Val Lys Glu Lys Ser Val Ser 65 70 75 80 Val Ile Leu Leu Ala Gly Gly Gln Gly Lys Arg Met Lys Met Ser Met 85 90 95 Pro Lys Gln Tyr Ile Pro Leu Leu Gly Gln Pro Ile Ala Leu Tyr Ser 100 105 110 Phe Phe Thr Phe Ser Arg Met Pro Glu Val Lys Glu Ile Val Val Val 115 120 125 Cys Asp Pro Phe Phe Arg Asp Ile Phe Glu Glu Tyr Glu Glu Ser Ile 130 135 140 Asp Val Asp Leu Arg Phe Ala Ile Pro Gly Lys Glu Arg Gln Asp Ser 145 150 155 160 Val Tyr Ser Gly Leu Gln Glu Ile Asp Val Asn Ser Glu Leu Val Cys 165 170 175 Ile His Asp Ser Ala Arg Pro Leu Val Asn Thr Glu Asp Val Glu Lys 180 185 190 Val Leu Lys Asp Gly Ser Ala Val Gly Ala Ala Val Leu Gly Val Pro 195 200 205 Ala Lys Ala Thr Ile Lys Glu Val Asn Ser Asp Ser Leu Val Val Lys 210 215 220 Thr Leu Asp Arg Lys Thr Leu Trp Glu Met Gln Thr Pro Gln Val Ile 225 230 235 240 Lys Pro Glu Leu Leu Lys Lys Gly Phe Glu Leu Val Lys Ser Glu Gly 245 250 255 Leu Glu Val Thr Asp Asp Val Ser Ile Val Glu Tyr Leu Lys His Pro 260 265 270 Val Tyr Val Ser Gln Gly Ser Tyr Thr Asn Ile Lys Val Thr Thr Pro 275 280 285 Asp Asp Leu Leu Leu Ala Glu Arg Ile Leu Ser Glu Asp Ser 290 295 300 9 1404 DNA Arabidopsis thaliana 9 atggcgtcct catccctttc ccctgctact caggttcact cttgatcgtt cgaatcgaaa 60 caatcgggtc tcttcgtata ggaatttggg tttttgaaag tttggttttt ttttttggtg 120 gcagcttggt tctagcagaa gtgctttgat ggcgatgtca agtgggttgt ttgtgaagcc 180 aacgaagatg aatcatcaaa tggttagaaa agagaagatt ggattgagaa tttcttgtca 240 agcgtcgagt attccagcag acagagttcc agatatggaa aagaggaaga ctttgaatct 300 tcttcttctt ggggctcttt ctctacctac tggctacatg cttgtccctt acgctacctt 360 ctttgttcct cctgggtgag attgctttta ctctgttcct cgatttgatt tcatttttcg 420 aatgagattc atgggttttc ttttctttta ctgtatatga taagagtccc acatcgttaa 480 ggtagaggag agagtgttgg tatatgtgtt cattggttcc tctccccata acctaggttc 540 ccatggatct tatgagctag acctcaactc tatcagtata atgattgcat tagatagata 600 aaatctagag actcttttgt attgtggtga agtttgaatt tactcttgta tgttgttact 660 ttgttagctt atggttttga ttcaatcgaa tatgtagtta 
tgaataagta tatggcttct 720 ttggtgttac tgagatgtga gattaatgag acttgatata gtttattttc gttcgttggt 780 cttggtttat aatctctacg ttgttttgaa aatgcagaac cggaggtgga ggtggtggta 840 ctccagccaa ggatgccctt ggaaacgatg tagttgcagc ggaatggctt aagactcatg 900 gtcccggtga ccgaaccttg acccaaggat taaaggtaag atcatgaaca acttatttag 960 ccctccttga aacactgctt taagaactca ataaaggttc ttgcatctca atgggtgtgg 1020 attatttgtt atagggagat ccgacttacc tagttgtaga gaacgacaag actctagcga 1080 catacggtat caacgcagtg tgcactcatc ttggatgtgt tgtgccatgg aacaaagctg 1140 agaacaagtt tctatgtcct tgccatggat cccaatacaa cgcccaagga agagtcgtta 1200 gaggtccagc cccattggta agtcaaatac attgcattct ttcttttttg cgtatctcta 1260 aagtggaccg tctaaagtaa ccttaatttc atggaacagt cgctagcgtt ggctcacgcg 1320 gatatagatg aagctgggaa ggttcttttt gttccatggg tggaaactga cttcaggact 1380 ggtgatgctc catggtggtc ttaa 1404 10 8002 DNA Arabidopsis thaliana 10 atggccggat taacgagaac ggccggtgct tttgcggtaa ctccacacaa gatctccgtt 60 tgcattctcc tgcagatata cgctccttcc gctcagatgt ctcttccttt tcctttctct 120 tccgttgctc agcacaaccg cctcggcctc tacttgctct ctcttactaa ggtttcactc 180 ttcgatttct ctcgaattgc aatatcttgt agttttttga ggtttaattc agcaatgtct 240 tttgattctt ctgagtttcc tggcccgtct ctcatacccg aagttgtatc gagtttcttt 300 attgcatttc ttagagcttc tttgctaatt tctgagattg atttcagcga tgtctgttgt 360 aacgatttca cggctcaaaa ctcttgtaac ttatgttgat ttcatgagtg aataatgaat 420 ttgctggttt cgtgtgctaa caaactgttc ttttccctga cgagattgat attcatgctg 480 cagtcttgcg atgatatatt tgagccgaag ctggaaaagc tcatcaacca gttgagggaa 540 gttggtgaag agatggacgc gtggctaact gaccatttaa ctaatagatt ttcctctttg 600 gcttcaccag atgatctatt aaatttcttt aatgacatgc gaggttagct acttgcttgc 660 ttaaggtctg ttttttcttc ttttccatca tgtttgactc aaattagtta tcatttcttt 720 gtatgtagga atacttggga gccttgattc aggagtcgtg caagatgatc agattatttt 780 ggatcccaac agcaacttgg gaatgtttgt tcgtcgttgc attttggcat tcaacctttt 840 atcgttcgag gtattgagct ttttacgttt gtgttattct tgttttattt tgagactgaa 900 taatattaat attattcaac atgacgtgta gtttgaggtt ttgcttgaag gacgcggatt 960 
tgttagcttt ttcgtcagcc acttaattgt gtctaaacat ctatgtgaag tttgtttgtt 1020 tcttgaccca ctacaatgga gtacttgtct aggctagact taaatgtaga cgcatgtata 1080 ggtacattaa aatatgatat ctaatggcag aactctggct ggtctggttg aagtgtaata 1140 tctagctgta gtgttcttga ttgctgcaga aacacataag tgaatgtttg aaagaaccgt 1200 ccatttcatt gatatgtgca gggagtttgt catctttttt caagtattga agattactgc 1260 aaagaagccc attcaagctt tgctcagttt ggtgcaccta ataataatct ggagtcatta 1320 atacaatatg atcagatgga tatggagaat tatgcaatgg ataaaccaac tgaagaaata 1380 gagtttcaga aaactgctag tggaattgtc ccttttcacc ttcatacacc agattcactt 1440 atgaaagcga cagaaggtat cccttatggt attcacccaa ttcagagaaa acctagagtt 1500 tttcagagtc tcaggtgcta gaaaactatg aggaaataga gttttatatt catgatcagg 1560 ctgcaattta gctgccactt ttctcatata tgtttggtaa tgtatttttg ctgatgatta 1620 gcatcagtca taggatcaac gttttatcaa agtctttcga tgtgtttgta tgaagctttt 1680 tcagactttt cttttttaac agttgggctt ctgaagatat ttcccttttt cccttttgta 1740 atgtagaaac gatctactaa gaaattaatt ttgtcggttt tttatatttt tgtttcattc 1800 ttatgctttg acactcactt gtcttgtaag gatctcaatg ttataataat gcacagaaca 1860 catgaaacat tgcaggtttg ctacataata ggaaggaaac atcaaggacc agcaagaaag 1920 atacagaagc tactccagtt gctcgtgcct caacaagtac acttgaggaa tctctggtag 1980 atgagtcatt attccttcgg acaaatttgc agatacaagg ctttttaatg gaacaggccg 2040 atgcaattga aatgtaaagt ctctttttcc ttagtattta tatatttctg tcataagtat 2100 tgctagtttt taaactttcc atccccttgt aatttgatgt cagccatgga agttcaagtt 2160 cattctcttc aagttccatc gaaagtttcc ttgatcagct tcagaaatta gcccctgaac 2220 tgcatcgtgt aatgtcttgc aattttgtat tttattttca gttaaatcag aagttaatca 2280 ttctttcagt gctttgattc ttacatcacg acatgcaaca atgcaggttc actttttgcg 2340 ttacttgaat aaacttcaca gtgatgacta ctttgctgct ttggataatc tcctccgtta 2400 ctttgattac aggtaatctg tactttttgc taattttctc tgtttcctaa aaatttatgt 2460 taagaataat attactgccg aaatcgttta ttgtttgcat gtgactcctg taaatatcca 2520 gttgtgtttg cctgttagat atttggcact aggaacgttg tttcaaacat gagctgatcc 2580 atgatccatt gccttaaagt gcagggactg agggatttga ccttgttcct ccttcaactg 2640 gctgcagcat 
gtatggaagg tacgagattg gtttgctatg tctgggaatg atgcatttcc 2700 gatttgggca tcctaatctg gctctagagg tgagaattca ttctcccgag agcttctata 2760 gctaaaaact tgttcttgac gtttagaatc gtctcttgat agaccaaatt gtactttgga 2820 tagtttactg ggctgttggt gaggctaata ataagagatg cccatttttg atgtatcttg 2880 agtgagttaa ccccacacta ttcttcacct tagatcatac ttctgtgtat atacagattt 2940 tactcgtagg atgctttaga aagttgaaaa tttgctcatg caagcaatga tgtgtttgag 3000 gtggggcaga cttaatgcga tttttttgct atcaacttca catttataga gtcttatcat 3060 tgtttggttt gctcataata tggtttctcc ttaaccacag gttttgacag aagctgtgcg 3120 tgtatcacag caggtatatt ctcactttat catccgctgt tagttcagtt tatagtactt 3180 ttggaagaaa gtggcgtgtt tgacgaattc ctgcacgatg tttaggttta tgaaagtttc 3240 cttaagctga atcttgagct agtttcttct attgttatat caccaagttc tgcaaagcga 3300 tttttagggt gtctcatggg tcagtgcatt cttaccagat atgaaaattt gttgtattcg 3360 gaacatgcta gaaggatcac atctctattg tttggcttta ttatgcaatt cataagcaac 3420 actaaagctt gcattgtttg aagagacgtt tgtacaagct gtaattcatt tatgttaatt 3480 aaaacttttt ttttttttaa ttctgcagct tagtaatgat acttgtctag catatacgct 3540 agcagcaatg agcaacttgt tatcggaaat gggcattgca agtacctccg gtgttctcgg 3600 atcctcatac tcacccgtca ctagcactgc gtcttcatta tctgtacaac aaagagtgta 3660 catacttttg aaagagtctt tgaggagagc tgacagtcta aagttaagac gcttagtggc 3720 ttctaatcat cttgcgatgg ctaaatttga gttgatggta aagtctttac ttaactgtga 3780 tgaacatagc tctttcctta tttatttatg atatatctct gatctgtcgt gagaatctgg 3840 ggtatgattt ttcatttaga acctggtgtg tctctcatat cttttgatgc acaaagaacc 3900 tgtgaattat cactgtgctc gcagcaattc tagttcttta tagtactcga aaaaatgtta 3960 acatgtgagg gaaatatata tagaagaatt tgtgacgtcg cttatgagaa aaagaaaaca 4020 caaaagaaaa gtgatgaagc acatttgata gatagatagg aagaagctgg agcaattaaa 4080 gttttttttg tttccataga aaattatttc agcgtctgaa ttttctaatg agaaaaatgt 4140 gatgtataga gaaagaaatc accagttttc ctttggcatg tgtatctatt tcatctcttt 4200 ccttgttttt tgtcctgtaa ctagacatac aaagcttgaa ttttcacgag ctttttcatt 4260 caaattattt gtttgctttg tttgttctgc tagccctata tatagggcat tgtcaactaa 4320 cttagacaaa atccatatac 
ttaagtttct actcttttga tatacgccaa atcaaaatag 4380 ttattttgtt gtaatacagc atgtgcaaag gcctctactg tcatttggtc ccaaagcttc 4440 tatgcgtcac aaaacttgtc cagttagtgt ctgcaaggta ctgaatatac cacccaactc 4500 ttttggaaat atttacatct ccgtttgagg tctaatcagt tcttgaatat tccatgtcag 4560 gaaataagac taggggcaca cctaatcagc gacttttctt ctgaaagctc tacaatgaca 4620 attgatggtt ctctaagctc ggcttggctt aaagacttgc aaaaaccatg gggtccacct 4680 gtgatttccc cagactccgg ttctagaaaa agttcaactt tttttcaact ctgtgatcat 4740 ttggtctcaa ttcctggatc cgtgtcacaa ttaataggtg cttcttattt actccgggct 4800 acttcatggg agttatatgg caggtaagat tgatatcgag tttctgttgg aagatgattg 4860 ttactttctt agaaagctct cctgcattgt tttctgagta ttgactagca ttaactaggg 4920 aagaattgtt tattgcaatt ttgatgtagg cactttcttc cacacaattt ccatatacct 4980 gtgttgtttc agttgactgt ttgctaggtg ctgcatattc ccattcaact atgtcttaga 5040 aagaaatttt ttgtgaggat attacttgat tgtgatatcg cttgttatga acttaaaatt 5100 gcgtaaacta ctagtatatg agtctccctc atgtattcca ctatgatgaa gtatgctaat 5160 ttataagtta gaactccata gcattgtctt cctttcgctt ttgatgctta caatacctat 5220 ttcaaagatg aaatcatgat catctcactt atctccactt gctgttactt tatctgtgtt 5280 tccatcttta ttgtattttc ttattcctgg gaggttattt taactcatct tatcacaatt 5340 tcttccatct cctttagaag ctaatgcatt tcactttaag tatgataact agaaagacca 5400 tagtttaatt tctcactgct cggtgcacca aaaaaaaaaa caatttctca ctgctcaaac 5460 ccttaatgtt gaagaagtca ttacttatac agtttcttct ttactgattt ctaacaacta 5520 aacatatgaa cttgacgatt tttttgtact ttattaattc atttgttttc tttagcgctc 5580 ccatggctcg gatgaatacc ttggtgtatg caactttatt cggtgactct tctaggtaat 5640 ttcctgtaga tatatcaatt gcctgttcta ctctcaattc ctttccctct ccgggtctct 5700 cgctcaccct ttctagctct ttttctttct aaaacggtct cttacattcg ttttgagtga 5760 ttttgatata tttcttacat ttgcagttcg tctgacgcag agttagcata cttgaagctc 5820 attcaacatt tggcactata taagggatac aaaggttaga atctgttgct caaggtcatg 5880 cgcaccttgt tatacatgtc tcagtatctc aagtcacaaa ttaacctata caaagtacct 5940 tgatgatcct tcgcaataaa aagttatagc tgaatattat gtttctgtag aatgaataag 6000 aaggccatgt cagtgtacaa aagtcagaac 
tttcttctat tttttctgat catgttcata 6060 tgttgctgct tctactgcat gcagggtatt tatttagatc ttgctccatg gtttttcttt 6120 tcctatgttg acctattttc tgaaccaatg tttttctaga tgcctttgct gctcttaagg 6180 tcgcagagga aaagttctta accgtatcga aatcaaaagt attgttgctc aagttgcaac 6240 tactacatga gcgtgccttg cattggtaag gttccctacc tcctatgttg aacatgcact 6300 caagttgcaa ccaacaaagt tgatccatta tctgctctct gtgatcactt actatattgt 6360 ttaagggtta tataaattca aaacagaggc ttcctagtgt tctggaattg ttatatctta 6420 agttggctgt tctaactttg ttaaatccat tagtgggaat ttaaaactag ctcaacgaat 6480 atgtaatgag ctaggaggct tggcatcaac agccatgggt gtagacatgg agctaaaagt 6540 agaagcaagt cttcgtgaag ctcggacttt gcttgcagca aaacagtata gccaggtttg 6600 taaaaaagtc ttattggcca tctgtttttt aagcccatgt gatatattta tttcaggcca 6660 tgtgctatgg caatgttagt gtgtcttcat tttgctttgg tactacgttc tctacaggca 6720 gcaaatgtgg cacactccct cttctgcaca tgtcacaaat tcaatttgca aatcgaaaag 6780 gcgtctgttc ttcttctgct cgcagagatc cataaggtaa gacatggcta caagaaatta 6840 ctagcttgag agcacattga ttggaatctg atattccatt gaaaatatgt tccgccttat 6900 caaattgcaa tcaaaacttt tttttttttt ctggaaattt gttccgttcc atgcatacat 6960 agctaacatc catctatatg ttctgaatgt gctctttctt tgctgattgc tcttagacta 7020 catagaaaag tgttgtacta tggtattatt tctaatcttt tattggtttt ctgaccggcg 7080 gccttcacct ttcgctttcc cctttttggg ttctcagaag tcaggaaatg ctgtcctggg 7140 tcttccatat gcgctggcaa gcatctcgtt ttgccagtca ttcaacttgg atcttctcaa 7200 agcatcagct actctcactc tggccgagct ttggcttggt cttggatcaa atcataccaa 7260 acgagcatta gaccttttgc atggggcttt ccctatgatt cttggccatg gaggtttgga 7320 gttgcgtgct cgagcttaca tctttgaagc aaactgctat ctatctgatc caagttcttc 7380 aggtagcttt tgtgtcttgt actgctttga cgaggatgtt cagaacataa ttgaatagag 7440 cctgactgtt atggataaag gttgatcttg ttagttgcga gtttctaatt tgtttactgt 7500 tgttaccagt ttccacagat tctgacactg tcttggattc tctaaggcaa gcttcagatg 7560 agcttcaagc tttggaggta taacttatgt caataagagt gctatgtagc tttttgttca 7620 aaaacaaacc tgagcactat attatcgcca catcattgca taagagatga acaaacaaac 7680 ctgagcacta tataattgcc acaacattgc ataagagatg 
ctatgttgct tttgattcga 7740 aaataacacc aaactcttat aactaggttc agcttcgttt agctcatcat aactctccta 7800 gtaccgtgat ttactgttgg ttcacttagt agttgtgttt attgtgcagt accatgaact 7860 ggcagcggaa gcctcgtact taatggccat ggtatatgac aagttgggac ggcttgatga 7920 gagggaagaa gctgcgtctt tgtttaagaa acatatcata gctctcgaga accctcaaga 7980 tgtggaacaa aacatggcat ga 8002 11 1303 DNA Arabidopsis thaliana 11 aatctgtctc aaccacagca tcaacagcaa cacactggaa gtaataatga cccaggaggt 60 ttaagcacag accacaacat acagcaagat tcaataatgg ttttagaaat ggtaaagaga 120 tgtccacaac ttactcttgg tgaaggaagt ccacagttcc atataacttg aatcgccttc 180 tctagttctt ccctgaactt ttcgttttcc ttaggcctga aattaaccag tactcgtatc 240 aaatttcaac aacacagcta gaactacaat agtcaggatt gaaagatccc atcagcaaga 300 atagagagcc agagacacct atctacagat acaataaagg ttgaaacttc taaacactaa 360 agtgagttga atccataaac aaaacagccc aataacaacc aaagtcagcg atcagcataa 420 aaactaatca acaaaatgtg tagcattttg acaagcataa gtaagacaaa ctgttaccac 480 caacctcttg tagcggttgt gaacaaggac catgagagga ctaatatctt tgaacaccgc 540 ctcttcccat tctgccgttg tgagatcttt aatcgggcgt atatagttat catcttgcca 600 aacaatctcc ttctcactat catctccttc actatcttca tcatcatcct cttcttcctt 660 cttttcattc ttaatcgtta tcttattaaa aaactgattg aaactaaacc cccaaccatc 720 agcatcttca ggccaattag gatcctcttc attcacaaca agtgggtaat cagcaagaag 780 cttctgcatc ttcctcctct tctcaaccat atcaatctct tcgtcttctt caatatcagg 840 atgcttatca atcacctctc ttatcgtctt cctccattcc cgtctctcta ctggatcatc 900 gagactcgct tctttcccgc caaaggtctc ttctttcact tcatcttcct catcttcact 960 gtcactatct tcttctttct ttctccatac cttcttccca agtttatcaa ctaacccaaa 1020 agctttcaca tttctcctcg acctcataac attccacttc tagcaatacc tcagagcaaa 1080 aaaggtttca aattagccaa ttcgcaactt aagtaatcaa aattagagag ttaggtaatc 1140 agaaactaac cgaggaggag gcagtcttca ccgaaccaaa agagtcatat ctcaacgaag 1200 taacttcaaa acgatctctt ttgcaatggg aaagtagtga agaagggctg aatccattat 1260 cctgaacagg gcaagtgaac tgtgtcgaaa atggaagaat cat 1303 12 2071 DNA Arabidopsis thaliana 12 atggcgatgc ttcagacgaa tcttggcttc attacttctc cgacatttct 
gtgtccgaag 60 cttaaagtca aattgaactc ttatctgtgg tttagctatc gttctcaagg caatttctca 120 tactctttat atacgttcaa accaatgaat ctctggttcg gtgaggagat ttttgattgt 180 ttctgaatgt tgatttttca gttcaaaaac tggatttttc gaaaagggtt aatagaagct 240 acaaaagaga tgctttatta ttgtcaatca agtgttcttc atcgactgga tttgataatg 300 taagtgttag ttcaaaactt ggaatcactt agttttgaag ctttatctct agtttaagtt 360 ttttttgttt ttcagagcaa tgctgtgaat gtaagtttaa agttttgaat caactcaaat 420 tttgaagctt tttttttatc tctaatgttt ttcagagcaa tgttgttgtg aaggagaaga 480 gtgtatctgt gattctttta gctggaggtc aaggcaagag aatgaaagta agtgtttttg 540 ggatgaaagt ttcaaacttt atgctcgagt tttaattgtt tgttctgttg agttgtttgt 600 ttttctatgt aatgtgaata gatgagtatg ccaaagcagt acataccact tcttggtcag 660 ccaattgctt tgtataggta acatttctta cttgaaagtc agttgattgg ttaagtcttg 720 tatgttgttt gaatctgttt ctgataacta attacagtct cgtttgattg atggaagagt 780 tgtaatttga ctgttttttc attctgcgga atctacttgt ctattagtat tgaaccatct 840 ttgctttgtg ttcgttgaat tttatgtttc attgttgcag ctttttcacg ttttcacgta 900 tgcctgaagt gaaggaaatt gtagttgtat gtgatccttt tttcagagac atttttgaag 960 gttttttagt ctctctttct ctcattagtt atgtttttgg ttgagagatg ttccaaagat 1020 ttctctgagc ctctttttgt tttgtggtgt tctagaatac gaagaatcaa ttgatgttga 1080 tcttagattc gctattcctg gcaaagaaag acaagattct gtttacagtg gacttcaggt 1140 ttatacctcc gttgctttga ttaacatatc caatgaatca tttactcaaa agaaataatt 1200 ctacaacata ctgtctgttc tagttcatat aagcttcaag tttatttctc ataggaaatc 1260 gatgtgaact ctgagcttgt ttgtatccac gactctgccc gaccattggt gaatactgaa 1320 gatgtcgaga aggtatactt atgagaagta gcaaaagatt aaggagaatg aaacaaaatt 1380 ctcgaagtcc ctttagaatt tattgcactg ttgtattttg ttaaggtcct taaagatggt 1440 tccgcggttg gagcagctgt acttggtgtt cctgctaaag ctacaatcaa agaggtacaa 1500 aatcttaagt tagttttttt tttttttgtc atcaccacat tctcgatatt cgacttgttc 1560 cattttgcga tatgcaggtc aattctgatt cgcttgtggt gaaaactctc gacagaaaaa 1620 ccctatggga aatgcagaca ccacaggttt taaagtatac tcatgctttg ttgattgatt 1680 tttggtttca aaccacttca ataatgcatg ctttttctgt aggtgatcaa accagagcta 1740 ttgaaaaagg 
gtttcgagct tgtaaaaagg tttgtgaacc ttcaaatgat tttctgaagt 1800 gtggattaga atataaaaag attgattact atttgtgttc tatatctaaa acagtgaagg 1860 tctagaggta acagatgacg tttcgattgt tgaatacctc aagcatccag tttatgtctc 1920 tcaaggatct tatacaaaca tcaaggtaac aaaacactaa ttttgtgttt tttcgcagcc 1980 gtagaatgaa aacaaacttt ctcatccatt gcaggttaca acacctgatg atttactgct 2040 tgctgagaga atcttgagcg aggactcatg a 2071 13 18 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 13 ngttgwgnat wtsgwgnt 18 14 15 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 14 tgwgnagsan casag 15 15 16 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 15 agwgnagwan cawagg 16 16 16 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 16 sttgntastn ctntgc 16 17 15 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 17 ntcgastwts gwgtt 15 18 16 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 18 wgtgnagwan canaga 16 19 30 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 19 actagctcta ccgtttccgt ttccgtttac 30 20 30 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 20 ttacctcggg ttcgaaatcg atcgggataa 30 21 33 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 21 aaaatcggtt atacgataac ggtcggtacg gga 33 22 36 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 22 gggtcttgcg gatctgaata tatgttttca tgtgtg 36 23 37 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 23 taccgaagaa aaataccggt tcccgtccga tttcgac 37 24 34 DNA Artificial Sequence Description of Artificial Sequence oligonucleotide 24 ggatcgtatc ggttttcgat taccgtattt atcc 34 25 3236 DNA Arabidopsis thaliana CDS (1)..(1056) 25 atg gcc gga tta acg aga acg gcc ggt gct ttt gcg gta act cca cac 48 Met Ala Gly Leu Thr Arg Thr Ala Gly Ala Phe Ala Val Thr Pro His 1 5 10 15 aag atc tcc gtt tgc att ctc ctg cag ata tac gct 
cct tcc gct cag 96 Lys Ile Ser Val Cys Ile Leu Leu Gln Ile Tyr Ala Pro Ser Ala Gln 20 25 30 atg tct ctt cct ttt cct ttc tct tcc gtt gct cag cac aac cgc ctc 144 Met Ser Leu Pro Phe Pro Phe Ser Ser Val Ala Gln His Asn Arg Leu 35 40 45 ggc ctc tac ctg ctc tct ctt act aag tct tgc gat gat ata tat gag 192 Gly Leu Tyr Leu Leu Ser Leu Thr Lys Ser Cys Asp Asp Ile Tyr Glu 50 55 60 ccg aag ctg gaa aag ctc atc aac cag ttg agg gaa gtt ggt gaa gag 240 Pro Lys Leu Glu Lys Leu Ile Asn Gln Leu Arg Glu Val Gly Glu Glu 65 70 75 80 atg gac gcg tgg cta act gac cat tta act aat aga ttt tcc tct ttg 288 Met Asp Ala Trp Leu Thr Asp His Leu Thr Asn Arg Phe Ser Ser Leu 85 90 95 gct tca cca gat gat cta tta aat ttc ttt aat gac atg cga gga ata 336 Ala Ser Pro Asp Asp Leu Leu Asn Phe Phe Asn Asp Met Arg Gly Ile 100 105 110 ctt ggg agc ctt gat tca gga gtc gtg caa gat gat cag att att ttg 384 Leu Gly Ser Leu Asp Ser Gly Val Val Gln Asp Asp Gln Ile Ile Leu 115 120 125 gat ccc aac agc aac ttg gga atg ttt gtt cgt cgt tgc att ttg gca 432 Asp Pro Asn Ser Asn Leu Gly Met Phe Val Arg Arg Cys Ile Leu Ala 130 135 140 ttc aac ctt tta tcg ttc gag gga gtt tgt cat ctt ttt tca agt att 480 Phe Asn Leu Leu Ser Phe Glu Gly Val Cys His Leu Phe Ser Ser Ile 145 150 155 160 gaa gat tac tgc aaa gaa gcc cat tca agc ttt gct cag ttt ggt gca 528 Glu Asp Tyr Cys Lys Glu Ala His Ser Ser Phe Ala Gln Phe Gly Ala 165 170 175 cct aat aat aat ctg gag tca tta ata caa tat gat cag atg gat atg 576 Pro Asn Asn Asn Leu Glu Ser Leu Ile Gln Tyr Asp Gln Met Asp Met 180 185 190 gag aat tat gca atg gat aaa cca act gaa gaa ata gag ttt cag aaa 624 Glu Asn Tyr Ala Met Asp Lys Pro Thr Glu Glu Ile Glu Phe Gln Lys 195 200 205 act gct agt gga att gtc cct ttt cac ctt cat aca cca gat tca ctt 672 Thr Ala Ser Gly Ile Val Pro Phe His Leu His Thr Pro Asp Ser Leu 210 215 220 atg aaa gcg aca gaa ggt atc cct tat ggt ttg cta cat aat agg aag 720 Met Lys Ala Thr Glu Gly Ile Pro Tyr Gly Leu Leu His Asn Arg Lys 225 230 235 240 gaa aca tca agg acc agc aag aaa gat 
aca gaa gct act cca gtt gct 768 Glu Thr Ser Arg Thr Ser Lys Lys Asp Thr Glu Ala Thr Pro Val Ala 245 250 255 cgt gcc tca tca agt aca ctt gag gaa tct ctg gta gat gag tca tta 816 Arg Ala Ser Ser Ser Thr Leu Glu Glu Ser Leu Val Asp Glu Ser Leu 260 265 270 ttc ctt cgg aca aat ttg cag ata caa ggc ttt tta atg gaa cag gcc 864 Phe Leu Arg Thr Asn Leu Gln Ile Gln Gly Phe Leu Met Glu Gln Ala 275 280 285 gat gca att gaa atc cat gga agt tca agt tca ttc tct tca agt tcc 912 Asp Ala Ile Glu Ile His Gly Ser Ser Ser Ser Phe Ser Ser Ser Ser 290 295 300 atc gaa agt ttc ctt gat cag ctt cag aaa tta gcc cct gaa ctg cat 960 Ile Glu Ser Phe Leu Asp Gln Leu Gln Lys Leu Ala Pro Glu Leu His 305 310 315 320 cgt gtt cac ttt ttg cgt tac ttg aat aaa ctt cac agt gat gac tac 1008 Arg Val His Phe Leu Arg Tyr Leu Asn Lys Leu His Ser Asp Asp Tyr 325 330 335 ttt gct gct ttg gat aat ctc ctc cgt tac ttt gat tac agg gac tga 1056 Phe Ala Ala Leu Asp Asn Leu Leu Arg Tyr Phe Asp Tyr Arg Asp 340 345 350 gggatttgac cttgttcctc cttcaactgg ctgcagcatg tatggaaggt acgagattgg 1116 tttgctatgt ctgggaatga tgcatttccg atttgggcat cctaatctgg ctctagaggt 1176 tttgacagaa gctgtgcgtg tatcacagca gcttagtaat gatacttgtc tagcatatac 1236 gctagcagca atgagcaact tgttatcgga aatgggcatt gcaagtacct ccggtgttct 1296 cggatcctca tactcacccg tcactagcac tgcgtcttca ttatccgtac aacaaagagt 1356 gtacatactt ttgaaagagt ctttgaggag agctgacagt ctaaagttaa gacgcttagt 1416 ggcttctaat catcttgcga tggctaaatt tgagttgatg catgtgcaaa ggcctctact 1476 gtcatttggt cccaaagctt ctatgcgtca caaaacttgt ccagttagtg tctgcaagga 1536 aataagacta ggggcacacc taatcagcga cttttcttct gaaagctcta caatgacaat 1596 tgatggttct ctaagctcgg cttggcttaa agacttgcaa aaaccatggg gtccacctgt 1656 gatttcccca gactccggtt ctagaaaaag ttcaactttt tttcaactct gtgatcattt 1716 ggtctcaatt cctggatccg tgtcacaatt aataggtgct tcttatttac tccgggctac 1776 ttcatgggag ttatatggca gcgctcccat ggctcggatg aataccttgg tgtatgcaac 1836 tttattcggt gactcttcta gttcgtctga cgcagagtta gcatacttga agctcattca 1896 acatttggca ctatataagg 
gatacaaagg ttagaatctg ttgctcaagg tcatgcgcac 1956 cttgttatac aagtcacaaa ttaacctata caaagtacct tgatgatcct tcgcaataaa 2016 aagttatagc tgaatattat gtttctgtag aatgaataag aaggccatgt cagtgtacaa 2076 aagtcagaac tttcttctat tttttctgat catgttcata tgttgctgct tctactgcat 2136 gcagggtatt tatttagatc ttgctccatg ttttttcttt tcctatgttg acctattttc 2196 tgaaccaatg tttttctaga tgcctttgct gctcttaaag tcgcagagga aaagttctta 2256 accgtatcga aatcaaaagt attgttgctc aagttgcaac tactacatga gcgtgccttg 2316 cattggtaag gttccctacc tcctatgttg aacatgcact caagttgcaa ccaacaaagt 2376 tgatccatta tctgctctct gtgatcactt actatattgt ttaagggtta tataaattca 2436 aaacagaggc ttcctattgt tctggaattg ttatatctta agttggctgt tctaactttg 2496 ttaaatccat tagtgggaat ttaaaactag ctcaacgaat atgtaatgag ctaggaggct 2556 tggcatcaac agccatgggc gtagacatgg agctaaaagt agaagcaagt cttcgtgaag 2616 ctcggacttt gcttgcagca aaacagtata gccaggcagc aaatgtggca cactccctct 2676 tctgcacatg tcacaaattc aatttgcaaa tcgaaaaggc gtctgttctt cttctgctcg 2736 cagagatcca taagaagtca ggaaatgctg tcctgggtct tccatatgcg ctggcaagca 2796 tctcgttttg ccagtcattc aacttggatc ttctcaaagc atcagctact ctcactctgg 2856 ccgagctttg gcttggtctt ggatcaaatc ataccaaacg agcattagac cttttgcatg 2916 gggctttccc tatgattctt ggccatggag gtttggagtt gcgtgctcga gcttacatct 2976 ttgaagcaaa ctgctatcta tctgatccaa gttcttcagt ttccacagat tctgacactg 3036 tcttggattc tctaaggcaa gcttcagatg agcttcaagc tttggagtac catgaactgg 3096 cagcggaagc ctcgtactta atggcgatgg tatatgacaa gctgggacgg cttgatgaga 3156 gggaagaagc tgcgtctttg tttaagaaac atatcatagc tctcgagaac cctcaagatg 3216 tggaacaaaa catggcatga 3236 26 351 PRT Arabidopsis thaliana 26 Met Ala Gly Leu Thr Arg Thr Ala Gly Ala Phe Ala Val Thr Pro His 1 5 10 15 Lys Ile Ser Val Cys Ile Leu Leu Gln Ile Tyr Ala Pro Ser Ala Gln 20 25 30 Met Ser Leu Pro Phe Pro Phe Ser Ser Val Ala Gln His Asn Arg Leu 35 40 45 Gly Leu Tyr Leu Leu Ser Leu Thr Lys Ser Cys Asp Asp Ile Tyr Glu 50 55 60 Pro Lys Leu Glu Lys Leu Ile Asn Gln Leu Arg Glu Val Gly Glu Glu 65 70 75 80 Met Asp Ala Trp Leu Thr Asp 
His Leu Thr Asn Arg Phe Ser Ser Leu 85 90 95 Ala Ser Pro Asp Asp Leu Leu Asn Phe Phe Asn Asp Met Arg Gly Ile 100 105 110 Leu Gly Ser Leu Asp Ser Gly Val Val Gln Asp Asp Gln Ile Ile Leu 115 120 125 Asp Pro Asn Ser Asn Leu Gly Met Phe Val Arg Arg Cys Ile Leu Ala 130 135 140 Phe Asn Leu Leu Ser Phe Glu Gly Val Cys His Leu Phe Ser Ser Ile 145 150 155 160 Glu Asp Tyr Cys Lys Glu Ala His Ser Ser Phe Ala Gln Phe Gly Ala 165 170 175 Pro Asn Asn Asn Leu Glu Ser Leu Ile Gln Tyr Asp Gln Met Asp Met 180 185 190 Glu Asn Tyr Ala Met Asp Lys Pro Thr Glu Glu Ile Glu Phe Gln Lys 195 200 205 Thr Ala Ser Gly Ile Val Pro Phe His Leu His Thr Pro Asp Ser Leu 210 215 220 Met Lys Ala Thr Glu Gly Ile Pro Tyr Gly Leu Leu His Asn Arg Lys 225 230 235 240 Glu Thr Ser Arg Thr Ser Lys Lys Asp Thr Glu Ala Thr Pro Val Ala 245 250 255 Arg Ala Ser Ser Ser Thr Leu Glu Glu Ser Leu Val Asp Glu Ser Leu 260 265 270 Phe Leu Arg Thr Asn Leu Gln Ile Gln Gly Phe Leu Met Glu Gln Ala 275 280 285 Asp Ala Ile Glu Ile His Gly Ser Ser Ser Ser Phe Ser Ser Ser Ser 290 295 300 Ile Glu Ser Phe Leu Asp Gln Leu Gln Lys Leu Ala Pro Glu Leu His 305 310 315 320 Arg Val His Phe Leu Arg Tyr Leu Asn Lys Leu His Ser Asp Asp Tyr 325 330 335 Phe Ala Ala Leu Asp Asn Leu Leu Arg Tyr Phe Asp Tyr Arg Asp 340 345 350 
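
As a reading aid for the degenerate oligonucleotides of SEQ ID NOS:13-18, which are written with the IUPAC ambiguity codes n (any base), w (a or t), and s (c or g), the following minimal Python sketch counts how many distinct sequences a degenerate oligonucleotide represents and checks whether it is compatible with a given target sequence. The function names `degeneracy` and `matches` are hypothetical and chosen only for illustration; they are not part of the disclosure.

```python
# IUPAC nucleotide ambiguity codes that occur in SEQ ID NOS:13-18
# (lowercase, as in the sequence listing).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "w": "at",    # weak: a or t
    "s": "cg",    # strong: c or g
    "n": "acgt",  # any base
}


def _clean(seq: str) -> str:
    """Drop the spacing used in the listing and normalise case."""
    return seq.replace(" ", "").lower()


def degeneracy(oligo: str) -> int:
    """Return the number of distinct sequences a degenerate oligo represents."""
    count = 1
    for base in _clean(oligo):
        count *= len(IUPAC[base])
    return count


def matches(oligo: str, target: str) -> bool:
    """Return True if every target base is allowed at the same oligo position."""
    oligo, target = _clean(oligo), _clean(target)
    if len(oligo) != len(target):
        return False
    return all(t in IUPAC[o] for o, t in zip(oligo, target))


if __name__ == "__main__":
    seq_id_13 = "ngttgwgnat wtsgwgnt"
    print(degeneracy(seq_id_13))
```

For SEQ ID NO:13, which contains three n, three w, and one s position, the count is 4 x 4 x 4 x 2 x 2 x 2 x 2 = 1024 sequences.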

What is claimed is:
 1. An isolated DNA molecule comprising a nucleotide sequence encoding an amino acid sequence substantially similar to SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8.
 2. The DNA molecule of claim 1, wherein said nucleotide sequence is substantially similar to SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7.
 3. The DNA molecule according to claim 1, wherein said nucleotide sequence is a plant nucleotide sequence.
 4. The DNA molecule of claim 1, wherein the amino acid sequence has GT1209, GT1354, or GT0946 activity.
 5. A polypeptide comprising an amino acid sequence encoded by a nucleotide sequence identical or substantially similar to SEQ ID NO:3, SEQ ID NO:5, or SEQ ID NO:7.
 6. The polypeptide of claim 5, wherein said amino acid sequence is substantially similar to SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8.
 7. The polypeptide of claim 5, wherein said amino acid sequence has GT1209, GT1354, or GT0946 activity.
 8. A polypeptide comprising an amino acid sequence comprising at least 20 consecutive amino acid residues of the amino acid sequence of SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, wherein the amino acid sequence has GT1209, GT1354, or GT0946 activity.
 9. An expression cassette comprising a promoter operatively linked to a DNA molecule comprising a nucleotide sequence encoding an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8.
 10. A recombinant vector comprising an expression cassette according to claim 9.
 11. A host cell comprising a DNA molecule comprising a nucleotide sequence encoding an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8.
 12. A host cell according to claim 11, wherein said host cell is selected from the group consisting of an insect cell, a yeast cell, a prokaryotic cell and a plant cell.
 13. A plant or seed comprising a plant cell of claim 12.
 14. A plant of claim 13, wherein said plant is tolerant to an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity.
 15. A method comprising: a) combining a polypeptide comprising the amino acid sequence encoded by a DNA molecule comprising a nucleotide sequence encoding an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, or a homolog thereof, and a compound to be tested for the ability to interact with said polypeptide, under conditions conducive to interaction; and b) selecting a compound identified in step (a) that is capable of interacting with said polypeptide.
 16. The method according to claim 15, further comprising: c) applying a compound selected in step (b) to a plant to test for herbicidal activity; and d) selecting compounds having herbicidal activity.
 17. A compound identifiable by the method of claim 15.
 18. A compound having herbicidal activity identifiable by the method of claim 16.
 19. A process of identifying an inhibitor of GT1802, GT1209, GT1354, or GT0946 activity comprising: a) introducing a DNA molecule comprising a nucleotide sequence encoding an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8, and encoding a polypeptide having GT1802, GT1209, GT1354, or GT0946 activity, or a homolog thereof, into a plant cell, such that said sequence is functionally expressible at levels that are higher than wild-type expression levels; b) combining said plant cell with a compound to be tested for the ability to inhibit the GT1802, GT1209, GT1354, or GT0946 activity under conditions conducive to such inhibition; c) measuring plant cell growth under the conditions of step (b); d) comparing the growth of said plant cell with the growth of a plant cell having unaltered GT1802, GT1209, GT1354, or GT0946 activity under identical conditions; and e) selecting said compound that inhibits plant cell growth in step (d).
 20. A compound having herbicidal activity identifiable according to the process of claim 19.
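
The growth comparison recited in steps (c) through (e) of claim 19 can be illustrated with a short Python sketch. The claims do not state a numerical selection criterion, so everything quantitative here is an assumption made for illustration: growth is expressed relative to an untreated control (1.0 means no inhibition), a compound counts as inhibitory when wild-type growth falls to or below `max_wild_type_growth`, and the cell expressing the target at elevated levels is required to grow better than wild type by at least `min_rescue`. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class GrowthResult:
    compound: str
    wild_type_growth: float      # unaltered target activity, relative to untreated control
    overexpressor_growth: float  # elevated target expression, relative to untreated control


def select_candidates(
    results: Iterable[GrowthResult],
    max_wild_type_growth: float = 0.5,  # assumed cut-off, not taken from the claims
    min_rescue: float = 0.2,            # assumed cut-off, not taken from the claims
) -> List[str]:
    """Return compounds that inhibit wild-type growth but less so the overexpressor."""
    selected = []
    for r in results:
        inhibits = r.wild_type_growth <= max_wild_type_growth
        rescued = (r.overexpressor_growth - r.wild_type_growth) >= min_rescue
        if inhibits and rescued:
            selected.append(r.compound)
    return selected


if __name__ == "__main__":
    # Made-up measurements, for demonstration only.
    demo = [
        GrowthResult("compound A", wild_type_growth=0.2, overexpressor_growth=0.8),
        GrowthResult("compound B", wild_type_growth=0.9, overexpressor_growth=0.95),
        GrowthResult("compound C", wild_type_growth=0.1, overexpressor_growth=0.1),
    ]
    print(select_candidates(demo))
```

In this made-up data set only compound A is selected: compound B does not inhibit growth, and compound C inhibits both cell types equally, which under the assumed criterion suggests an effect unrelated to the overexpressed target.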